TAF2 OE AS analysis

1 Intro

Alternative splicing (AS) events included in this analysis:

  1. alternatively spliced exons and microexons

  2. intron retention

  3. alternative splice sites (acceptor at 3’ and donor at 5’)

For each sample AS events were quantified using the percentage of sequence inclusion (PSI) metric, which is a number ranging from zero to 100, corresponding to full sequence skipping or full sequence inclusion in all transcripts of a gene, respectively. For example, constitutively spliced exons have a PSI of 100. The PSI cannot be calculated for the very first and last exons of a gene.

Last code execution: 2025 March 03, Monday @ 17:03:23.

These documents reports the downstream AS analysis after processing the samples with VAST-TOOLS(Tapial et al. 2017) using the following steps:

2 FASTQ files processing

vast-tools(Tapial et al. 2017) was used to quantify both AS events PSI and gene expression as normalised TPMs using limma::normalizeBetweenArrays and cRPKMs per gene. The analysis was performed using vast-tool v2.5.1.

The raw FASTQ files were processed as following:

2.1 align

vast-tools maps the RNA-seq reads to a predefined set of exon-exon junctions (EEJ). This set of EEJ is an “enriched” version of all ENSEMBL transcripts. This dataset was analysed using Homo sapiens assembly hg38, based on ENSEMBL v88. To do so vast-tools align first trims the reads to 50 nucleotides fragments with a sliding window define by --stepSize (usually 25 or 20 nt). The trimmed reads are aligned to the EEJ libraries using bowtie v1.2. The trimmed reads must have at least 8 nucleotides overlapping the splice site and be uniquely mapping to be considered for quantification.

For this analysis fastq files were processed with vast-tools align with options --sp hg38, ---IR_version 2, --stepSize 20, and --expr. Reads strand detection is done automatically.

2.2 merge

To increase sequencing read depth the vast-tools has a module called merge that pools the individual replicates of each condition into a super sample. This is basically equivalent to concatenating FASTQ files with cat before running align. This helps prevent biases in AS events identification towards highly expressed genes. For this analysis the 3 individual replicates per condition were merged.

2.3 combine

To build the main output containing all detected AS events the vast-tools combine module takes intermediate output files and generate a table named as INCLUSION_LEVELS_FULL-<SPECIES_ASSEMBLY>-<SAMPLES_NUM>-v251.tab. This file contains an AS event on each row, the first 6 columns contain basic information about the event (e.g. gene name, length, and coordinates) followed by a column with the PSI and a quality score column for each sample. Further details about the comma separated quality scores can be found here. This step also generates a similar table gene expression counts expressed as cRPKM or TPM.

All individual replicates and the merged super samples were combined with vast-tools combine with options --add_version, --TPM and --norm for gene expression. For this analysis the output file has 721551 AS events.

2.4 compare

To calculate the differential inclusion level (∆PSI) between samples the combined table was processed with vast-tools compare using the HeLa FRT samples as control (-a) and the following filtering parameters: --min_dPSI 15, --min_range 0, --max_dPSI 5, --print_sets, --noB3, --p_IR, --print_dPSI, and --GO. HeLa TAF2 OE and NLS-TAF2 ∆IDR OE were each compared to the control separately. The comparison was done using the 3 individual replicates per condition or by using only the merged super sample. This is denoted in the file names as _uniq_ or _mrgd_.

2.5 tidy

To simplify and filter the full inclusion table the module vast-tools tidy produces a table with only events that have a minimum coverage that are more likely to relevant for downstream analysis.

Full inclusion table was filtered using vast-tools tidy with: --noB3, --p_IR, --add_names, and --log. The tidy module has an option to discard the VLOW events (see below). Here since there are replicates the VLOW events are include in the analysis.

2.6 diff

Lastly, a statistical differential splicing analysis was performed with vast-tools diff using Bayesian inference. Briefly , it calculates PSI posterior distributions using Bayes’ theorem and employs replicates to estimate joint posterior distributions. The statistical significance between two posterior distributions is determined by comparing their differences using empirical distributions (Han et al. 2017) (Weatheritt, Sterne-Weiler, and Blencowe 2016).

The vast-tools diff module was run with options: -n 10000 (lines to process in parallel), -m 0.15 (min ∆PSI/100) -e 10 (minimum reads), -z 16 (random seed), and --noPDF (do not plot the estimates).

3 Selecting differentially spliced events

I use 3 different methods to select differentially spliced events (DSEs).

3.1 Effect size method

The compare module pre-filters the events based on read coverage imbalance and other features. The events are then simply filtered by calculating average and individual ∆PSIs. Between control and test samples the PSI distribution must be non overlapping, i.e. at least 10 PSI difference between the maximum and minimum PSI of the 2 groups (--min_range 10).

Note

compare is a non-statistical approach

Differentially spliced events have a minimum |∆PSI| >= 15. This might have the caveat of missing AS events in lowly expressed genes.

3.2 Frequentist inference method

Using the inclusion table generated by the tidy module, the events were further filtered to have coverage in at least 2 samples per condition and statistical difference between the average PSI in each condition was calculate using the Student’s t-test followed by multiple hypothesis testing (BH) correction using rstatix::t_test in R.

Events with a P-value <= 0.01 and a |∆PSI| >= 10 were considered significant. The non parametric Mann-Whitney (rstatix::wilcox_test) was also used to check if not assuming normality impacts the results drastically in a separate appendix.

3.3 Bayesian inference method

The module diff uses:

  1. A uniform prior distribution Beta(α = 1, β = 1).

  2. A binomial likelihood function for the number of inclusion reads K ~ Binomial(Ψ, N).

where Ψ represents PSI for any AS event and N is the total number of junction reads per event. The formula K ~ Binomial(Ψ, N) means that the number of inclusion reads (K) is assumed to follow a binomial distribution with Ψ as the probability of inclusion (i.e., success) and N as the total number of junction reads per splicing event.

This prior distribution represents our initial beliefs about the PSI before observing any data, while the likelihood function represents the probability of observing the data given a specific PSI value. The likelihood function follows a binomial distribution, which is appropriate for count data like the number of inclusion reads.

According to Bayes’ theorem, our prior beliefs about PSI are updated based on observed data to obtain the posterior distribution. In this case, the prior distribution (uniform Beta) and the likelihood function (binomial) are “conjugate” to each other, which means that the posterior distribution can be expressed in a closed form using the same type of distribution as the prior. Therefore the posterior distribution over Ψ, is represented as a Beta distribution with parameters Ψ ≈ Beta(K + α, (N – K) + β).

If replicates are available, joint posterior distributions for a sample are estimated from sampling empirical posterior distributions of the replicates and fitting a new posterior Beta with maximum likelihood estimation (MLE) using the MASS::fitdist function in R. This basically means combining the information from multiple replicates to get a more robust estimate of the PSI. The statistical significance between two biological conditions, defined as the level of confidence in determining whether two posterior PSI distributions are significantly different from each other, modelled as two posterior distributions X~ Beta, and Y ~ Beta, is calculated as P(X – Y > 0), i.e., X is higher than Y. This probability can be estimated from the difference of empirical distributions sampled between X and Y such that P(X – Y > 0) =\(\sum_{i=1}^{n} (Xi –Yi > 0) / N\). Lastly, significantly differential events were additionally required to have a PSI difference > 10.

Using the diff module of vast-tools filter for events with a Minimum Value (MV) of |∆PSI| at 95% confidence >= 15%. This means that each event as a 95% probability to have a |∆PSI| between the 2 conditions higher or equal to 15% (MV).

4 Set up

4.1 Packages

Load common packages that have to be pre-installed. Check Section 9 for package versions.

Code
library(ggplot2)
library(ggrepel)
library(ggforce)
library(pheatmap)
library(UpSetR)
library(viridis)
library(rstatix)
library(Cairo)
library(crayon) # for coloured text.

In addition I developed an R package called niar to stream line some of the common analysis steps. It can be installed from my GitHub repository using:

Code
devtools::install_github("Ni-Ar/niar")
library(niar)

Here I load the package from my local repository with:

Code
local <- FALSE
if(local) {
    devtools::load_all(path = '~/mnt/narecco/software/R/niar')
} else if (local == FALSE) {
     devtools::load_all(path = '~/software/R/niar')
} else {
  stop('local is not a Logical')
}

4.2 Functions

Colour palettes:

Events coverage quality score colour palette.

Code
quality_score_colors <- c('#ffffcc', '#c2e699', '#78c679', '#31a354', '#006837')
quality_score_values <- c("N", "VLOW", "LOW", "OK", "SOK")
names(quality_score_colors) <- quality_score_values

Tanja’s conditions colour palette

Code
sample_palette <- c('#009900',  '#663399', '#FFCC33')
names(sample_palette) <- c('TAF2', 'TAF2∆IDR', 'NLS-TAF2∆IDR')

Some generic functions I wrote for this analysis:

A plotting function to explore sample’s ∆PSI of exons and introns

Code
#' Group differentially spliced exons and introns
#'
#' @param data A data frame with the columns `dPSI`, `COMPLEX`, `comparison`
#'
#' @return A ggplot
plot_dPSI_hist_grpd <- function(data) {
  
  exon_type <- c("S", "C1", "C2", "C3", "ANN", "MIC")
  intron_type <- "IR"
  alt_ss <- c("Alt3", "Alt5")
  
  data |>
    select(dPSI, COMPLEX, comparison) |>
    mutate(TYPE = case_when(COMPLEX %in% exon_type ~ 'Exon',
                            COMPLEX %in% intron_type ~ 'Intron',
                            COMPLEX %in% alt_ss ~ 'Alt. s.s.',
                            ) ) |>
    
    # subset(! COMPLEX %in% c('Alt3', 'Alt5') ) |>
    mutate(TYPE = factor(TYPE, levels = c('Exon', 'Intron', 'Alt. s.s.'))) |>
    unique() |>
    ggplot() +
      aes(x = dPSI, fill = dPSI > 0 ) +
      facet_grid(TYPE ~ comparison, scales = 'free_x',
                 labeller = labeller(comparison = c(TAF2dIDR = 'NLS-TAF2 ∆IDR',
                                                    TAF2 = 'TAF2') ) ) +
      geom_histogram(binwidth = 1, show.legend = F) +
      scale_x_continuous(n.breaks = 10) +
      scale_y_continuous(n.breaks = 4, expand = expansion(mult = c(0, 0.01))) +
      scale_fill_manual(values = c("TRUE" = "firebrick3", "FALSE" = "dodgerblue3")) +
      labs(x = "\u0394PSI (OE - Cntrl)", y = "Num. of events") +
      theme_classic(base_family = 'Arial', base_size = 6) +
      theme(legend.position = 'none', 
            plot.background = element_blank(),
            panel.background = element_blank(), 
            panel.grid.major = element_line(colour = 'gray84', linewidth = 0.1),
            panel.grid.minor.y = element_blank(),
            panel.border = element_blank(),
            axis.text = element_text(colour = 'black'), 
            strip.background = element_blank() ) -> cmpr_dPSI_Hist
  return(cmpr_dPSI_Hist)
}

QC plot with the number of events within each quality score

Code
#' Show the number of events by quality score in a stacked bar plot
#'
#' @param data A dataframe with the complex `COMPLE` and `Quality_Score_Value`
#'
#' @return A ggplot
plot_quality_score_stacked <- function(data) {
  ggplot(data) +
    aes(x = COMPLEX, fill = Quality_Score_Value) +
    geom_bar(colour = 'black', linewidth = 0.2) +
    scale_fill_manual(values = quality_score_colors, name = "Score") +
    scale_y_continuous(expand = expansion(mult = 0, add = 1), n.breaks = 10) +
    labs(x = 'Type of AS event', y = 'Number of AS event') +
    theme_classic(base_family = 'Arial', base_size = 6) +
    theme(legend.position = c(0.45, 0.95),
          legend.key.size = unit(2, units = 'mm'), 
          legend.direction = 'horizontal',
          plot.background = element_blank(),
          panel.background = element_blank(), 
          panel.grid.major.y = element_line(linewidth = 0.2, colour = 'grey84'),
          panel.grid.major.x = element_blank(),
          panel.grid.minor.y = element_blank(),
          panel.border = element_blank(),
          axis.text = element_text(colour = 'black'),
          axis.ticks = element_line(linewidth = 0.2),
          axis.ticks.x = element_blank(),
          axis.line = element_line(linewidth = 0.2)) -> cmpr_Scores_Bars
  return(cmpr_Scores_Bars)
}

Wrapper to explore the PSI and quality scores of specific events.

Code
plot_PSI_boxplot <- function(df, num_col = 5, order_by_dPSI = T) {

  if (order_by_dPSI) {
    df <- df |>
    mutate(EVENT = fct_reorder(EVENT, dPSI)) 
  
    gene_order <- unique( df[order(df$EVENT, decreasing = T), ]$GENE)
    
    df <- df |>
      mutate(GENE = factor(GENE, levels = gene_order))
  }
  
  ggplot(df) +
    aes(x = OE, y = PSI) +
    facet_wrap( ~ GENE + EVENT, scales = "free_x", ncol = num_col) +
    geom_boxplot(fill = NA, outlier.shape = NA, lwd = 0.2) +
    geom_point(aes(fill = Quality_Score_Value, shape = Replicate),
               size = 1.5, stroke = 0.2, 
               position = position_dodge(width = 0.5)) +
    scale_shape_manual(values = c('Merge' = 22, 'A' = 21, 'B' = 21, 'C' = 21)) +
    scale_fill_manual(values = quality_score_colors, name = "Quality\nScore") +
    scale_x_discrete(labels = c('FRT' = 'Cntrl', 'TAF2' = 'TAF2', 'NLSTAF2dIDR' = 'NLS-TAF2∆IDR') ) +
    scale_y_continuous(breaks = seq(0, 100, 10), 
                       expand = expansion(mult = 0, add = 2)) +
    guides(fill = guide_legend(override.aes = list(shape = 21)),
           shape = 'none') +
    coord_cartesian(ylim = c(0, 100), clip = 'off') +
    labs(y = 'PSI') +
    theme_classic() +
    theme(axis.text = element_text(colour = "black"),
          axis.line = element_line(linewidth = 0.2),
          axis.title.x = element_blank(),
          strip.background = element_blank(),
          strip.text = element_text(vjust = 1, lineheight = 0, 
                                    margin = margin(t = 0, unit = 'mm')),
          panel.grid.major = element_line(colour = 'gray84', linewidth = 0.15),
          legend.position = c(0.93, 0.1) ) 
}

Calculate z-score on a numeric matrix

Code
zScore <- function(mat) {
  row_means <- apply(mat, 1, mean)
  row_sd <- apply(mat, 1, sd)
  # handle degenerate distribution in which the denominator (row sd) of the z-score formula is zero. 
  # the z-score becomes undefined as the denominator of the z-score formula is zero.  
  # this is a rare situation in general. Here I fix it by diving by 1 instead.
  if( any(row_sd == 0) ) { row_sd[which(row_sd == 0)] <- 1 }
  zScore_mat <- (mat - row_means) / row_sd  
  return(zScore_mat)
}

Alternative formula to z-score that used median

Code
zMad <- function(mat) {
  row_medians <- apply(mat, 1, median)
  row_median_abs_dev <- abs(mat - row_medians)
  median_row_median_abs_dev <- apply(row_median_abs_dev, 1, median)
  # conceptually row_mad is the sd. 1.482 is a constant linked to the assumption of normality of the data
  row_mad <- median_row_median_abs_dev * 1.4826
  
  # If row max is zero fix it by diving by 1 instead.
  if( any(row_mad == 0) ) { row_mad[which(row_mad == 0)] <- 1 } 
   
  zMad_dev <- (mat - row_medians) / row_mad
  return(zMad_dev)
}
# mat <- head(tidy_full_mat)
# zMad(mat)
# zScore(mat)

Function I wrote to make sure alternative donors and acceptors are not repeated between up and down regulated sets (used in tidy analysis).

Code
#' Match Alternative 3' or 5' Splice Site IDs
#'
#' This function takes two sets of upregulated and downregulated alternative 3' or 5' splice site IDs 
#' (VAST-TOOLS format) and resolves them by ensuring the IDs are correctly matched, taking into account 
#' variations in the alternative splicing.
#'
#' @param up_ids A character vector of upregulated splice site IDs (VAST-TOOLS format).
#' @param do_ids A character vector of downregulated splice site IDs (VAST-TOOLS format).
#' @return A character vector of resolved and unique splice site IDs.
#' @examples
#' up_ids <- c("HsaALTA0001463-1/2", "HsaALTA0006052-1/2", "HsaALTA0006256-2/2",
#'             "HsaALTA1001090-3/3", "HsaALTA1015934-2/4")
#' do_ids <- c("HsaALTA0001463-2/2", "HsaALTA0001572-1/5", "HsaALTA0006052-2/2",
#'              "HsaALTA0006256-1/2", "HsaALTA0007106-4/5", "HsaALTA0008669-2/4", "HsaALTA1001090-2/3")
#' match_alt35_id(up_ids, do_ids)
#' # Returns: "HsaALTA0001463-1/2" "HsaALTA0006052-1/2" "HsaALTA0006256-2/2"
#' #          "HsaALTA1001090-3/3" "HsaALTA1015934-2/4" "HsaALTA0001572-1/5" 
#' #          "HsaALTA0007106-4/5" "HsaALTA0008669-2/4"
match_alt35_id <- function(up_ids, do_ids) {
  # Calculate the number of unique IDs combining both upregulated and downregulated sets
  all_up_do_ids_len <- length(unique(c(up_ids, do_ids)))
  
  # Remove specific splice variation suffix (-x/y) and then calculate the number of unique IDs
  sam_up_do_ids_len <- length(unique(c(str_remove(string = up_ids, pattern = '\\-[1-9]/[1-9]'),
                                       str_remove(string = do_ids, pattern = '\\-[1-9]/[1-9]'))))
  
  # Case 1: If the number of unique IDs remains the same after removing variations, return the unique set
  if (all_up_do_ids_len == sam_up_do_ids_len) {
    return(unique(c(up_ids, do_ids)))
  } 
  # Case 2: If the number of unique IDs differs, identify and filter specific splice IDs
  else if (all_up_do_ids_len > sam_up_do_ids_len) {
    
    # Check if the downregulated set has more or equal IDs compared to the upregulated set
    if (length(do_ids) >= length(up_ids)) {
      # Identify and retain downregulated splice IDs that do not match upregulated IDs without the suffix
      indx_do_ids <- which(!str_remove(string = do_ids, pattern = '\\-[1-9]/[1-9]') %in%
                           str_remove(string = up_ids, pattern = '\\-[1-9]/[1-9]'))
      do_ids <- do_ids[indx_do_ids]
      return(unique(c(up_ids, do_ids)))
    }
    
    # Check if the upregulated set has more or equal IDs compared to the downregulated set
    if (length(up_ids) >= length(do_ids)) {
      # Identify and retain upregulated splice IDs that do not match downregulated IDs without the suffix
      indx_up_ids <- which(!str_remove(string = up_ids, pattern = '\\-[1-9]/[1-9]') %in%
                           str_remove(string = do_ids, pattern = '\\-[1-9]/[1-9]'))
      up_ids <- up_ids[indx_up_ids]
      return(unique(c(up_ids, do_ids)))
    }
  }
}

4.3 File paths

Files and directories paths. All files are backed-up on the CRG cluster.

Code
if(local) {
    proj_dir <- file.path('~/mnt/narecco/projects/01_ALTdemix')
} else if (local == FALSE) {
    proj_dir <- file.path('~/projects/01_ALTdemix')
} else {
    stop('local is not a Logical')
}
proj_dir <- normalizePath(proj_dir)
expr_dir <- file.path(proj_dir, 'data/INCLUSION_tbl/Tanja')
vast_tools_dir <- file.path(expr_dir, 'vast_tools')
vast_out <- file.path(expr_dir, 'vast_tools/vast_out')

psi_path <- file.path(vast_out, 'INCLUSION_LEVELS_FULL-hg38-12-v251.tab')
expr_path <- file.path(vast_out, 'TPM-hg38-12-NORM.tab')

cmpr_dir <- file.path(vast_out, 'compare_2023_08_01/min_dPSI15_min_range0_max_dPSI5')
cmpr_TAF2_uniq_path <- file.path(cmpr_dir, 'HeLa_TAF2OE_vs_CNTRL_uniq_noB3_pIR.tab')
cmpr_TAF2dIDR_uniq_path <- file.path(cmpr_dir, 'HeLa_NLSTAF2dIDR_vs_CNTRL_uniq_noB3_pIR.tab')

cmpr_TAF2_mrgd_path <- file.path(cmpr_dir, 'HeLa_TAF2OE_vs_CNTRL_mrgd_noB3_pIR.tab')
cmpr_TAF2dIDR_mrgd_path <- file.path(cmpr_dir, 'HeLa_NLSTAF2dIDR_vs_CNTRL_mrgd_noB3_pIR.tab')

tidy_dir <- file.path(vast_out, 'tidy_2023_08_03')
tidy_psi_path <- file.path(tidy_dir, 'INCLUSION_LEVELS_TIDY_HeLa_noB3_pIR.tab')

diff_dir <- file.path(vast_out, 'diff_2023_06_20/min_dPSI_min_reads10')
diff_TAF2_path <- file.path(diff_dir, 'Fltrd_15_HeLa_TAF2OE_minRead10_noB3_pIR.tab')
diff_TAF2dIDR_path <- file.path(diff_dir, 'Fltrd_15_HeLa_NLSTAF2dIDR_minRead10_noB3_pIR.tab')

plot_dir <- file.path(expr_dir, 'plots/TAF2-OE')
if ( !dir.exists(plot_dir) ) { dir.create(path = plot_dir, recursive = T) }

dse_IDs_dir <- file.path(expr_dir, 'diff_spliced_IDs/TAF2-OE')
if ( !dir.exists(dse_IDs_dir) ) { dir.create(path = dse_IDs_dir, recursive = T) }

5 Differential splicing analysis

Select the differentially spliced events based on the 3 different methods described above. This is the core part of this report.

5.1 compare

Import all the tables from the vast-tools compare analysis in a list.

Code
lapply(list.files(cmpr_dir, full.names = T, pattern = '^HeLa_.*_noB3_pIR.tab$'), function(x) {
  read_vst_tbl(path = x, show_col_types = FALSE) |>
    tidy_vst_psi() |>
    mutate(comparison = gsub(pattern = '_noB3_pIR.tab', replacement = '', x) ) |>
    mutate(comparison = gsub(pattern = paste0(cmpr_dir, '/'), replacement = '', comparison) ) |>
    mutate(comparison = gsub(pattern = "HeLa_", replacement = '', comparison )) |>
    mutate(comparison = gsub(pattern = "vs_CNTRL_", replacement = '', comparison ))
}) -> list_compares

Bind the list of dataframes into one single dataframe.

Code
cmpr_events <- do.call('rbind', list_compares) 

Create a metadata dataframe from the samples’ names.

Code
cmpr_events |>
  subset(comparison == 'TAF2OE_uniq') |>
  select(Sample) |>
  unique() |>
  mutate(CellType = str_split_fixed(string = Sample, pattern = "_", n = 3)[,1]) |>
  mutate(OE = str_split_fixed(string = Sample, pattern = "_", n = 3)[,2]) |>
  mutate(Replicate = str_split_fixed(string = Sample, pattern = "_", n = 3)[,3]) |>
  mutate(Replicate = ifelse(Replicate == '', yes = 'Merge', no = Replicate )) |>
  mutate(PrettySample = str_remove(string = Sample, pattern = 'HeLa_') ) -> mtdf

Add metadata info to the compared events table.

Code
cmpr_events <- cmpr_events |>
  left_join(y = mtdf, by = join_by(Sample)) 

5.1.1 Events coverage QC

Note

Not all AS events are easily measurable and not all events have the same sequencing coverage.

The AS events quantified by vast-tools are divided in different subgroups based on their complexity, which is defined in the 6th column (COMPLEX) of the compare table. This complexity resemble the underlying exon-intron structures in the gene body. The complexity of an AS exon was established based on Score 5 from the quality column of the inclusion table. It summaries the number of reads that come from the “reference” C1-A, A-C2 and C1-C2 junctions for a wide panel of RNA-seq samples (see Irimia et al. 2014 for further information). The complexity score is the same for all the samples for any given event.

Namely:

  • S, C1, C2, C3 are types of exon skipping events quantified by the splice site-based or transcript-based modules, with increasing degrees of complexity (e.g. S = simple; C3 more complex than C2).
  • ANN are exon skipping events quantified by the ANNOTATION module. Their IDs start with 6 or 7 (e.g. HsaEX6000001 or HsaEX7000001). These are exons from the reference GTF annotation used to build VAST-TOOLS. However, some annotated events are not present, because of low mappability or reads imbalance.
  • MIC are exon skipping events quantified by the microexon pipeline.
  • IR are intron retention event.
  • Alt3 are alternative acceptor events at the 3’ splice site of an exon.
  • Alt5 are alternative donor events at the 5’ splice site of an exon.

In addition, an AS event can be measured at different sequencing depths. In vast-tool this is resembled by Score 1 of the quality column. Each event in each samples is given a score based on the actual read counts, before mappability correction. This coverage scores are: N, VLOW, LOW, OK, and SOK, (where N = Not meeting minimum threshold, VLOW = Very low and SOK = Super okay). In general anything above N can be considered reliable and worth considering, but VLOW events might have an actual different PSI value if confirmed experimentally with a PCR on RT-cDNA.

Here I check if the quality score of the events changes when including or excluding the super sample.

Code
ggplot(cmpr_events) +
  aes(x = Quality_Score_Value, fill = comparison) +
  facet_grid(~COMPLEX)  +  
  geom_bar(position = position_dodge(), width = 0.75, colour = 'black', linewidth = 0.1) +
  scale_fill_manual(values = c('#FFCC33', 'darkgoldenrod', '#009900', 'green4'), 
                    name = 'compared against control') +
  scale_y_continuous(expand = expansion(add = c(0, 1)), n.breaks = 10) +
  labs(x = 'AS event quantification quality score', 
       y = 'Number of AS events') +
  theme_bw(base_family = 'Arial') +
  theme(legend.position = 'bottom',
        axis.text = element_text(colour = 'black'),
        axis.text.x = element_text(angle = 45, hjust =1),
        axis.ticks.x = element_blank(),
        strip.background = element_blank(),
        plot.background = element_blank(),
        legend.key.size = unit(x = 3, units = 'mm'),
        legend.background = element_blank(),
        legend.margin = margin(t = 0, unit = 'mm'),
        panel.grid.minor = element_blank(),
        panel.grid.major.x = element_blank(),
        panel.grid.major.y = element_line(linewidth = 0.1, colour = 'gray84'),
        panel.background = element_blank(),
        panel.border = element_rect(linewidth = 0.2, colour = 'black')) -> p_QC_events_by_type
p_QC_events_by_type

As one can see there’s a gain in lower quality events if including the merged replicates super samples. This behaviour is kind of expected.

Code
ggsave(filename = '01_QC_AS_Events_by_Type_Coverage_Barplot.pdf', plot = p_QC_events_by_type, 
       device = cairo_pdf, path = plot_dir, units = 'cm', width = 30,
       height = 10)

When filtering out for N events…

Code
fltrd_cmpr_events <- subset(cmpr_events, as.integer(Quality_Score_Value) >= 2)

…and repeating the plot.

Code
ggplot(fltrd_cmpr_events) +
  aes(x = Quality_Score_Value, fill = comparison) +
  facet_grid(~COMPLEX)  +  
  geom_bar(position = position_dodge(), width = 0.75, colour = 'black', linewidth = 0.1) +
  scale_fill_manual(values = c('#FFCC33', 'darkgoldenrod', '#009900', 'green4'), 
                    name = 'compared against control') +
  scale_y_continuous(expand = expansion(add = c(0, 1)), n.breaks = 10) +
  labs(x = 'AS event quantification quality score', 
       y = 'Number of AS events') +
  theme_bw(base_family = 'Arial') +
  theme(legend.position = 'bottom',
        axis.text = element_text(colour = 'black'),
        axis.text.x = element_text(angle = 45, hjust =1),
        axis.ticks.x = element_blank(),
        strip.background = element_blank(),
        plot.background = element_blank(),
        legend.key.size = unit(x = 3, units = 'mm'),
        legend.background = element_blank(),
        legend.margin = margin(t = 0, unit = 'mm'),
        panel.grid.minor = element_blank(),
        panel.grid.major.x = element_blank(),
        panel.grid.major.y = element_line(linewidth = 0.1, colour = 'gray84'),
        panel.background = element_blank(),
        panel.border = element_rect(linewidth = 0.2, colour = 'black')) -> p_fltrd_QC_events_by_type
p_fltrd_QC_events_by_type

One can see that the number DSEs drops and that again only the mrgd super samples, show more events for very low coverage events (VLOW). However for high coverage events (SOK) the number of DSEs doesn’t drastically change if using 3 replicates comparisons (uniq) or the merged comparison (mrgd). Interesting to see the gain in high quality IR with the TAF2 OE specifically. Also there are basically no microexon events differentially spliced.

Code
ggsave(filename = '02_QC_AS_Events_No_N_by_Type_Filtered_by_Coverage_Barplot.pdf', 
       plot = p_fltrd_QC_events_by_type, device = cairo_pdf, 
       path = plot_dir, units = 'cm', width = 30, height = 10)

5.1.2 Shared and Specific DSEs

How many of these events overlap between the different comparisons?

Make a list with all the differentially spliced events from each comparison, again filtering out the N events.

Code
# get the filtered events
lapply( list_compares, function(x){
  subset(x, as.integer(Quality_Score_Value) >= 2) |>
    select(GENE, EVENT, comparison) |> unique()
}) -> diff_events_li

# find events that are mis-spliced in all 4 comparisons
all_common_DSE_ids <- Reduce(intersect, lapply(diff_events_li, function(x) x$EVENT ) )

# get comparisons names
comparisons <- lapply( list_compares, function(x){ pull(x, comparison) |> unique() }) |> unlist()

# get events in each comparison
events_li <- lapply( list_compares, function(x){ 
  subset(x, as.integer(Quality_Score_Value) >= 2) |>
    pull(EVENT) |> unique() 
}) 

names(events_li) <- comparisons
lapply(events_li, head, 3)
$NLSTAF2dIDR_mrgd
[1] "HsaEX1037317" "HsaEX0014558" "HsaEX0036972"

$NLSTAF2dIDR_uniq
[1] "HsaEX0004676" "HsaEX0021749" "HsaEX0026581"

$TAF2OE_mrgd
[1] "HsaEX0044934" "HsaEX0003358" "HsaEX0031099"

$TAF2OE_uniq
[1] "HsaEX0003358" "HsaEX0053616" "HsaEX0045901"

To show how many AS events are differentially spliced in each condition taking into consideration the replicates or the super sample, calculate the intersections between all 4 comparisons.

Code
intersection_cmpr_plt <- upset(fromList(events_li), nsets = length(comparisons),
                               text.scale = 1.5, nintersects = NA, 
                               mb.ratio = c(0.65, 0.35), decreasing = c(T, F),
                               order.by = 'freq',
                               main.bar.color = "gray16") 

Show the overlap as an upset plot:

Code
intersection_cmpr_plt

The plot above shows all possible intersections between the 4 groups.

Because the super samples (defined with _mrgd) are the sum of all the 3 individual replicates, they have more coverage and can better quantify exon-exon-junction reads. This explains the increased in number of DSEs. TAF2 OE and NLS-TAF2∆IDR super samples have more or less the same number of unique DSE, 1045 and 1008 DSEs respectively. Of these 311 DSEs are shared between the 2 over-expressions in the super sample comparison.

There are 66 DSEs shared between the TAF2 OE super samples and the comparison done only with 3 replicates and 66 DSEs shared between NLS-TAF2OE∆IDR super samples and the 3 replicates of this condition.

Save the upset plot to pdf.

Code
cairo_pdf(file = file.path(plot_dir, '03_overlap_fltrd_compare_DSE_Upset.pdf'), 
          width = 5.5, height = 5, bg = NA)
intersection_cmpr_plt
dev.off()
png 
  2 

Separate the parts of the dataframe with the uniq and mrgd comparisons, this is going to be useful when looking at the ∆PSI. Also rename the names of the comparisons.

Code
cmpr_events_uniq <- subset(fltrd_cmpr_events, 
                           comparison %in% c('TAF2OE_uniq', 'NLSTAF2dIDR_uniq')) |>
  mutate(comparison = case_when(comparison == 'TAF2OE_uniq' ~ 'TAF2',
                                comparison == 'NLSTAF2dIDR_uniq' ~ 'TAF2dIDR'))
Code
cmpr_events_mrgd <- subset(fltrd_cmpr_events, 
                           comparison %in% c('TAF2OE_mrgd', 'NLSTAF2dIDR_mrgd')) |>
  mutate(comparison = case_when(comparison == 'TAF2OE_mrgd' ~ 'TAF2',
                                comparison == 'NLSTAF2dIDR_mrgd' ~ 'TAF2dIDR'))

Check the overall quality / confidence of the uniq and mrgd events.

Code
plot_quality_score_stacked(cmpr_events_uniq) +
  labs(title = 'UNIQ comparison') -> p_cmpr_scores_bars_uniq
Warning: A numeric `legend.position` argument in `theme()` was deprecated in ggplot2
3.5.0.
ℹ Please use the `legend.position.inside` argument of `theme()` instead.
Code
plot_quality_score_stacked(cmpr_events_mrgd) +
  labs(title = 'MRGD comparison') -> p_cmpr_scores_bars_mrgd

p_cmrp_scores_bar <- p_cmpr_scores_bars_uniq / p_cmpr_scores_bars_mrgd
p_cmrp_scores_bar

Besides the different number of events the event-types trend is basically the same, except maybe for intron retentions. On the one hand, this reinsurances me that the super sample is a good representation of the replicates with just more coverage. Save bar plot.

Code
ggsave(filename = '04_Compare_DSE_Score_Bar.pdf', plot = p_cmrp_scores_bar, 
       device = cairo_pdf, path = plot_dir, units = 'cm', width = 7,
       height = 6.5)

Check the distribution of ∆PSI using a histogram with binwidth of 1. The top part are all differentially spliced exons (S, C1, C2, and C3), while the bottom part are the intron retentions (IR).

Code
plot_dPSI_hist_grpd(data = cmpr_events_uniq) +
  labs(title = 'UNIQ comparison') -> p_cmpr_dPSI_hist_uniq

plot_dPSI_hist_grpd(data = cmpr_events_mrgd) +
  labs(title = 'MRGD comparison') -> p_cmpr_dPSI_hist_mrgd

hist_list <- list(p_cmpr_dPSI_hist_uniq, p_cmpr_dPSI_hist_mrgd)
p_cmpr_dPSI_hist <- wrap_plots(hist_list, byrow = F, heights = c(0.5, 1) )
p_cmpr_dPSI_hist

The breath of the ∆PSI is not massive with the uniq comparison. The ∆PSI magnitudes are however somewhat comparable between the 2 OE (TAF2 on the left, NLS-TAF2∆IDR on the right).

Save histogram plot.

Code
ggsave(filename = '05_compare_DSE_dPSI_Hist.pdf', plot = p_cmpr_dPSI_hist, 
       device = cairo_pdf, path = plot_dir, units = 'cm', width = 10,
       height = 8)

Explore some specific events

In the boxplot filled circle points indicate individual replicates PSI value, while the square points indicate the merged super-sample between the 3 replicates.

Code
both_TAF2OE_ids <- events_li$TAF2OE_mrgd[(events_li$TAF2OE_mrgd %in% events_li$TAF2OE_uniq)]
message(length(both_TAF2OE_ids), ' DSEs in TAF2 OE mrgd and uniq')
66 DSEs in TAF2 OE mrgd and uniq
Code
both_TAF2dIDR_ids <- events_li$NLSTAF2dIDR_mrgd[(events_li$NLSTAF2dIDR_mrgd %in% events_li$NLSTAF2dIDR_uniq)]
message(length(both_TAF2dIDR_ids), ' DSEs in NLS-TAF2 ∆IDR OE mrgd and uniq')
66 DSEs in NLS-TAF2 ∆IDR OE mrgd and uniq
Code
total_both_both_ids <- unique(c(both_TAF2OE_ids, both_TAF2dIDR_ids))
message("Which sum up to ", length(total_both_both_ids), ' DSEs that are mis-spliced in one or both 2 OE and are shared between mrgd and uniq comparisons')
Which sum up to 118 DSEs that are mis-spliced in one or both 2 OE and are shared between mrgd and uniq comparisons
Code
cmpr_events_mrgd |>
  subset(EVENT %in% total_both_both_ids ) |>
  subset( abs(dPSI) >= 26 ) |>
  select(GENE, EVENT, Sample, PSI, dPSI, comparison) |>
  arrange(desc(dPSI)) |>
  pull(EVENT) |> unique() -> top_both_both_ids

# most drastic examples
subset(cmpr_events, EVENT %in% top_both_both_ids) |>
  plot_PSI_boxplot(num_col = 5) +
  theme(legend.position = c(0.9, 0.19)) 

This plot show some interesting cases. For example, the gene ATG2B has an alternative 3’ acceptor splice site HsaALTA1004608-3/4 that is clearly an up-regulation only when NLS-TAF2∆IDR is over expressed. The same trend is observable for HsaALTD0000934-2/3 in the TMEM230 gene.

Code
ggsave(filename = '06_compare_DSE_PSI_boxplot.pdf', plot = last_plot(), 
       device = cairo_pdf, path = plot_dir, units = 'cm', width = 26,
       height = 14)

Check events that are differentially spliced in the super sample but not in the 3-replicate comparison.

Code
mrgd_not_uniq_TAF2OE_ids <- events_li$TAF2OE_mrgd[!(events_li$TAF2OE_mrgd %in% events_li$TAF2OE_uniq)]
message(length(mrgd_not_uniq_TAF2OE_ids), ' DSEs in TAF2 OE mrgd but not in TAF2 OE uniq')
979 DSEs in TAF2 OE mrgd but not in TAF2 OE uniq
Code
mrgd_not_uniq_TAF2dIDR_ids <- events_li$NLSTAF2dIDR_mrgd[!(events_li$NLSTAF2dIDR_mrgd %in% events_li$NLSTAF2dIDR_uniq)]
message(length(mrgd_not_uniq_TAF2dIDR_ids), ' DSEs in NLS-TAF2∆IDR OE mrgd but not in NLS-TAF2∆IDR OE uniq')
942 DSEs in NLS-TAF2∆IDR OE mrgd but not in NLS-TAF2∆IDR OE uniq
Code
total_mrgd_only_both_ids <- unique(c(mrgd_not_uniq_TAF2OE_ids, mrgd_not_uniq_TAF2dIDR_ids))
message("Which sum up to ", length(total_mrgd_only_both_ids), ' DSEs that are mis-spliced in one or both 2 OE in the mrgd super sample comparion only')
Which sum up to 1633 DSEs that are mis-spliced in one or both 2 OE in the mrgd super sample comparion only
Code
cmpr_events_mrgd |>
  subset(EVENT %in% total_mrgd_only_both_ids) |>
  subset( abs(dPSI) >= 50 ) |>
  select(GENE, EVENT, Sample, PSI, dPSI, comparison) |>
  arrange(desc(dPSI)) |>
  pull(EVENT) |> unique() -> top_mrgd_not_uniq_both_ids
  
subset(cmpr_events, EVENT %in% top_mrgd_not_uniq_both_ids) |>
  plot_PSI_boxplot(num_col = 5) +
  theme(legend.position = 'none') 

For these events the individual replicates have a very low coverage (N), however the super sample has enough reads to quantify the PSI more reliably.

The gene SPTLC3 has 5 differentially spliced exons specifically only in the TAF2 OE. These are: - HsaEX6077214 - HsaEX6077218 - HsaEX6077217 - HsaEX0061607 - HsaEX0061606

After a closer inspection in vastDB and the genome browser these are very complex cassette exon events that should be further validate experimentally if needed. They all have a similar trend and might be either ORF disrupting or having an uncertain impact.

A more interesting case could be the alternative acceptor HsaALTA1003439-1/2 of exon 94 of the SYNE2 gene which seems to be always skipped in basal condition and becomes a used splice site only upon OE.

Code
ggsave(filename = '07_compare_DSE_PSI_boxplot.pdf', plot = last_plot(), 
       device = cairo_pdf, path = plot_dir, units = 'cm', width = 26,
       height = 14)

Check one specific event in the gene TAF2.

Code
subset(cmpr_events, EVENT %in% 'HsaEX0063481') |>
  plot_PSI_boxplot(num_col = 4) +
  theme(legend.position = 'none') 

Code
ggsave(filename = '08_compare_TAF2_exon_PSI_boxplot.pdf', plot = last_plot(), 
       device = cairo_pdf, path = plot_dir, units = 'cm', width = 8,
       height = 7)

The TAF2 gene has a cryptic alternative exon HsaEX0063481, here the PSI goes to zero when TAF2 is OE. I believe this is due to the fact that the cDNA used for the over-expression excluded this exonic sequence and the TAF2∆IDR is not containing this region encoding the cryptic exon, hence the PSI is similar to the control.

Select events that are differentially spliced using the comparison method

Given the fact that the replicates are not very deeply sequenced what I’m doing is to create a ‘bona fide’ set of events that are differentially spliced in both the super sample comparison and the comparison with the 3 replicates. In addition, I create a more relaxed set of all events that are mis-spliced (|∆PSI| >= 15) in any of the 2 comparisons. Each set is further divided into shared events between the 2 OE and events specific to each OE.

Code
bona_fide_TAF2_ids <- intersect(events_li$TAF2OE_mrgd, events_li$TAF2OE_uniq)
bona_fide_TAF2dIDR_ids <- intersect(events_li$NLSTAF2dIDR_mrgd, events_li$NLSTAF2dIDR_uniq)
shared_bona_fide_ids <- intersect(bona_fide_TAF2_ids, bona_fide_TAF2dIDR_ids)
specific_bona_fide_TAF2_ids <- bona_fide_TAF2_ids[!bona_fide_TAF2_ids %in% shared_bona_fide_ids]
specific_bona_fide_TAF2dIDR_ids <- bona_fide_TAF2dIDR_ids[!bona_fide_TAF2dIDR_ids %in% shared_bona_fide_ids]

cmpr_TAF2_all_ids <- c(events_li$TAF2OE_mrgd, events_li$TAF2OE_uniq)
cmpr_TAF2dIDR_all_ids <- c(events_li$NLSTAF2dIDR_mrgd, events_li$NLSTAF2dIDR_uniq)

shared_cmpr_all_ids <-  intersect(cmpr_TAF2_all_ids, cmpr_TAF2dIDR_all_ids)
specific_cmpr_TAF2_ids <- cmpr_TAF2_all_ids[!cmpr_TAF2_all_ids %in% shared_cmpr_all_ids]
specific_cmpr_TAF2dIDR_ids <- cmpr_TAF2dIDR_all_ids[!cmpr_TAF2dIDR_all_ids %in% shared_cmpr_all_ids]

any_cmpr_both_oe_ids <- unique(c(bona_fide_TAF2_ids, bona_fide_TAF2dIDR_ids, 
                                 cmpr_TAF2_all_ids, cmpr_TAF2dIDR_all_ids))
message('compare method found ', length(any_cmpr_both_oe_ids), ' DSEs in total')
compare method found 1759 DSEs in total

For now it looks like there are some changes in AS and most events are impacted by both OE.

Separate between the exons and the introns that are up or down spliced in TAF2 OE

Code
cmpr_events_mrgd |>
  subset(EVENT %in% mrgd_not_uniq_TAF2OE_ids) |>
  select(GENE, EVENT, Sample, PSI, dPSI, comparison) |>
  subset(comparison == 'TAF2') |>
  arrange(desc(dPSI)) -> cmpr_TAF2

up_cmpr_TAF2 <- subset(cmpr_TAF2, dPSI >= 0)
do_cmpr_TAF2 <- subset(cmpr_TAF2, dPSI <= 0)

EX_up_cmpr_TAF2 <- grep(pattern = "^HsaEX", x = up_cmpr_TAF2$EVENT, value = T)
EX_do_cmpr_TAF2 <- grep(pattern = "^HsaEX", x = do_cmpr_TAF2$EVENT, value = T)

IN_up_cmpr_TAF2 <- grep(pattern = "^HsaINT", x = up_cmpr_TAF2$EVENT, value = T)
IN_do_cmpr_TAF2 <- grep(pattern = "^HsaINT", x = do_cmpr_TAF2$EVENT, value = T)

Separate between up and down regulated alternative 3’ and 5’ splice sites TAF2 OE

Code
A3_up_cmpr_TAF2 <- grep(pattern = "^HsaALTA", x = up_cmpr_TAF2$EVENT, value = T)
A3_do_cmpr_TAF2 <- grep(pattern = "^HsaALTA", x = do_cmpr_TAF2$EVENT, value = T)

A5_up_cmpr_TAF2 <- grep(pattern = "^HsaALTD", x = up_cmpr_TAF2$EVENT, value = T)
A5_do_cmpr_TAF2 <- grep(pattern = "^HsaALTD", x = do_cmpr_TAF2$EVENT, value = T)

A3_cmpr_TAF2 <- match_alt35_id(up_ids = A3_up_cmpr_TAF2, do_ids = A3_do_cmpr_TAF2)
A5_cmpr_TAF2 <- match_alt35_id(up_ids = A5_up_cmpr_TAF2, do_ids = A5_do_cmpr_TAF2)

Separate between the exons and the introns that are up or down spliced in TAF2∆IDR OE

Code
cmpr_events_mrgd |>
  subset(EVENT %in% mrgd_not_uniq_TAF2dIDR_ids) |>
  select(GENE, EVENT, Sample, PSI, dPSI, comparison) |>
  subset(comparison == 'TAF2dIDR') |>
  arrange(desc(dPSI)) -> cmpr_TAF2dIDR

up_cmpr_TAF2dIDR <- subset(cmpr_TAF2dIDR, dPSI >= 0)
do_cmpr_TAF2dIDR <- subset(cmpr_TAF2dIDR, dPSI <= 0)

EX_up_cmpr_TAF2dIDR <- grep(pattern = "^HsaEX", x = up_cmpr_TAF2dIDR$EVENT, value = T)
EX_do_cmpr_TAF2dIDR <- grep(pattern = "^HsaEX", x = do_cmpr_TAF2dIDR$EVENT, value = T)

IN_up_cmpr_TAF2dIDR <- grep(pattern = "^HsaINT", x = up_cmpr_TAF2dIDR$EVENT, value = T)
IN_do_cmpr_TAF2dIDR <- grep(pattern = "^HsaINT", x = do_cmpr_TAF2dIDR$EVENT, value = T)

Separate between up and down regulated alternative 3’ and 5’ splice sites TAF2∆IDR OE

Code
A3_up_cmpr_TAF2dIDR <- grep(pattern = "^HsaALTA", x = up_cmpr_TAF2dIDR$EVENT, value = T)
A3_do_cmpr_TAF2dIDR <- grep(pattern = "^HsaALTA", x = do_cmpr_TAF2dIDR$EVENT, value = T)

A5_up_cmpr_TAF2dIDR <- grep(pattern = "^HsaALTD", x = up_cmpr_TAF2dIDR$EVENT, value = T)
A5_do_cmpr_TAF2dIDR <- grep(pattern = "^HsaALTD", x = do_cmpr_TAF2dIDR$EVENT, value = T)

A3_cmpr_TAF2dIDR <- match_alt35_id(up_ids = A3_up_cmpr_TAF2dIDR, do_ids = A3_do_cmpr_TAF2dIDR)
A5_cmpr_TAF2dIDR <- match_alt35_id(up_ids = A5_up_cmpr_TAF2dIDR, do_ids = A5_do_cmpr_TAF2dIDR)

5.2 Frequentist inference method (from tidy)

Start by importing the tidy dataset. Here “tidy” means a subset of all identified AS events that have generally a good score (i.e. coverage) and balanced read counts on both splice sites.

Code
tidy_events <- read_vst_tbl(path = tidy_psi_path, show_col_types = FALSE) |>
  mutate(GENE = str_remove(pattern = '=Hsa.*$', string = EVENT) ) |>
  mutate(EVENT = str_remove(pattern = '^*.*=', string = EVENT) ) |>
  relocate(GENE, .before = EVENT) |>
  pivot_longer(cols = starts_with('HeLA'), names_to = 'Sample', values_to = 'PSI') 

message('Tidy events: ', length(unique(tidy_events$EVENT)))
Tidy events: 14657
Code
tidy_events_backup <- tidy_events

5.2.1 PCA

First let’s use these events to check the dispersion of the samples with a PCA. First turn the PSI dataframe into a matrix

Code
tidy_events |> 
  select(GENE, EVENT, Sample, PSI) |>
  pivot_wider(id_cols = EVENT, names_from = Sample, values_from = PSI) |>
  column_to_rownames('EVENT') |>
  as.matrix() -> tidy_mat

Plot the PCA using the function showme_PCA2D form my R package and save it.

Code
showme_PCA2D(log2(tidy_mat), mt = mtdf, mcol = 'Sample', real_aspect_ratio = T,
             m_fill = 'OE', m_label = 'PrettySample', n_top_var = 1000) +
    scale_fill_manual(values = c('gray', '#FFCC33', '#009900'))

Code
ggsave(filename = '09_tidy_PCA.pdf', plot = last_plot(), 
       device = cairo_pdf, path = plot_dir, units = 'cm', width = 20,
       height = 15)

Samples group well between each other. Principal component one separates based on the OE, while component two separates more the TAF2 ∆IDR from the other 2. The next plot shows how much variance each component explains.

Code
showme_PCA2D(log2(tidy_mat), mt = mtdf, mcol = 'Sample', show_variance = T, 
             real_aspect_ratio = F, m_fill = 'OE', m_label = 'PrettySample', 
             n_top_var = 1000) +
    scale_fill_manual(values = c('gray', '#FFCC33', '#009900'))

Check which events are the most prominent on component 1.

Code
showme_PCA2D(log2(tidy_mat), mt = mtdf, mcol = 'Sample', n_loadings = 10,
             n_top_var = 1000) 

Save all 1000 loadings of all components in a table

Code
showme_PCA2D(log2(tidy_mat), mt = mtdf, mcol = 'Sample', n_loadings = 10,
             return_data = T, n_top_var = 1000) -> tidy_loadings

5.2.2 Heatmaps

Explore the samples with a heatmap of the AS events. First let’s check how many events are missing the PSI (e.g. not enough coverage). In the next heatmap white represent NA values and black represent a quantification (from 0 to 100). Of course the merged super samples have more coverage. I don’t think there’s a specific patter and I’d say that the values are missing at random.

Code
cairo_pdf(file = file.path(plot_dir, '10_tidy_missing_values_heatmap.pdf'), 
          width = 5.5, height = 8, bg = NA)
pheatmap(tidy_mat, na_col = 'white', cluster_rows = F, cluster_cols = T,
         legend = F, show_rownames = F, color = 'black', 
         main = paste0( nrow(tidy_mat), ' AS events' ) )

Code
dev.off()
cairo_pdf 
        3 

Since the the super samples have more coverage let’s plot only the AS events in those 3 columns. NA are again displayed in white.

Code
mat <- tidy_mat[, c('HeLa_FRT', 'HeLa_TAF2', 'HeLa_NLSTAF2dIDR')]
colnames(mat) <- gsub('HeLa_', '',  colnames(mat))

pheatmap(mat = mat, na_col = 'white', cluster_rows = F, cluster_cols = T,
         main = paste0( nrow(mat), ' AS events'), angle_col = 0,
         show_rownames = F, color = viridis(n = 100), cutree_cols = 3)

Code
rm(mat)

The control sample separates from the other 2 OE. Now with the replicates only. Since the replicates have some NAs let’s try to select only the events with at least coverage in 5 out of all 9 samples.

Code
tidy_events |> 
  left_join(y = mtdf, by = join_by(Sample) ) |>
  subset(Replicate != 'Merge' ) |>
  group_by(EVENT) |>
  mutate(samples_w_NA = sum(is.na(PSI))) |>
  relocate(samples_w_NA, .after = PSI) |>
  subset(samples_w_NA <= 4) |>
  select(GENE, EVENT, Sample, PSI) |>
  mutate(Sample = gsub('HeLa_', '', Sample)) |>
  pivot_wider(id_cols = EVENT, names_from = Sample, values_from = PSI) |>
  column_to_rownames('EVENT') |>
  as.matrix() -> tidy_fltrd_mat

cairo_pdf(file = file.path(plot_dir, '11_tidy_all_samples_at_least_5_coverage_heatmap.pdf'), 
          width = 5.5, height = 8, bg = NA)
pheatmap(tidy_fltrd_mat, na_col = 'white', cluster_rows = T, cluster_cols = T, 
         show_rownames = F, color = viridis(n = 100), treeheight_col = 25,
         angle_col = 315, main = paste0(nrow(tidy_fltrd_mat), ' AS event') )

Code
dev.off()
cairo_pdf 
        3 

The controls cluster well together but between the TAF2 and NLS-TAF2 overexpression the samples are all pretty similar. Now plot only the events that have full coverage in across all 9 samples.

Code
tidy_events |> 
  left_join(y = mtdf, by = join_by(Sample) ) |>
  subset(Replicate != 'Merge' ) |>
  group_by(EVENT) |>
  mutate(samples_w_NA = sum(is.na(PSI))) |>
  relocate(samples_w_NA, .after = PSI) |>
  subset(samples_w_NA <= 0) |>
  select(GENE, EVENT, Sample, PSI) |>
  mutate(Sample = gsub('HeLa_', '', Sample)) |>
  pivot_wider(id_cols = EVENT, names_from = Sample, values_from = PSI) |>
  column_to_rownames('EVENT') |>
  as.matrix() -> tidy_full_mat

pheatmap(tidy_full_mat, cluster_rows = T, cluster_cols = T, treeheight_col = 25,
         show_rownames = F, angle_col = 315, color = viridis(n = 100),
         main = paste0(nrow(tidy_full_mat), ' AS event with full coverage') )

Check how events change between conditions, to do this I calculate the z-score to scale the PSI of each event between the the super samples. This calculation is done for each EVENT (a single row) individually. I made a function to do this called zScore but it would be as using the native R function scale like this: t(scale(t(mat) where the matrix transpositions are used to apply the zscaling on each rows rather then on the columns.

Code
tidy_events |> 
  left_join(y = mtdf, by = join_by(Sample) ) |>
  subset(Replicate == 'Merge' ) |>
  group_by(EVENT) |>
  mutate(samples_w_NA = sum(is.na(PSI))) |>
  relocate(samples_w_NA, .after = PSI) |>
  subset(samples_w_NA <= 0) |>
  select(GENE, EVENT, Sample, PSI) |>
  mutate(Sample = gsub('HeLa_', '', Sample)) |>
  pivot_wider(id_cols = EVENT, names_from = Sample, values_from = PSI) |>
  column_to_rownames('EVENT') |>
  as.matrix() -> tidy_full_mat

set.seed(16)
pheatmap(zScore(tidy_full_mat), cluster_rows = T, cluster_cols = T, 
         angle_col = 0, treeheight_col = 25, show_rownames = F, 
         main = paste0(nrow(tidy_full_mat), ' AS event with full coverage'),
         color = met.brewer("Benedictus", n = 100, type = 'continuous', direction = -1) )

The effect of the change (i.e. PSI between different conditions) is colour coded by the z-score. The heatmap shows that overall the 2 OEs do not have a very similar effect. Here one has to consider that the PSI is a bounded value from zero to 100. So even small PSI difference lead to a zscore different of ~ 1. I also tried another method based on Median Absolute Deviation (MAD) and obtained values that were not really scaled, so for visualisation purposes fix to 5 max and and -5 min.

Code
ztidy_mad <- zMad(tidy_full_mat)
# set maximum to 10 -10
ztidy_mad[ztidy_mad > 5 ] <- 5
ztidy_mad[ztidy_mad < -5] <- -5
pheatmap(ztidy_mad, cluster_rows = T, cluster_cols = T, 
         angle_col = 0, treeheight_col = 25, show_rownames = F, 
         main = paste0(nrow(tidy_full_mat), ' AS event with full coverage'),
         color = met.brewer("Benedictus", n = 100, type = 'continuous', direction = -1) )

This doesn’t really help.

For sake of completeness I apply the logit transformation to the PSI, that tends to scale them a bit better as they are not bounded between 0 and 100 any more by between -infinity and +infinity. So the events that are either 0 or 100 in one case need to be removed.

Code
logit_tidy_full_mat <- log10(tidy_full_mat / (100-tidy_full_mat))

# select only finite events (PSI = 0, and 100 are removed in the logit transformation)
finite_logit_tidy_full_mat <- logit_tidy_full_mat[is.finite(rowSums(logit_tidy_full_mat)), ]

set.seed(16)
pheatmap(zScore(finite_logit_tidy_full_mat), cluster_rows = T, cluster_cols = T,
         angle_col = 0, treeheight_col = 25, show_rownames = F, 
         main = paste0(nrow(finite_logit_tidy_full_mat), ' AS event with full coverage'),
         color = met.brewer("Benedictus", n = 100, type = 'continuous', direction = -1))

As an example, extract the events with the biggest difference between all samples (using the z-score scaling).

Code
zScore(tidy_full_mat) |>
  as.data.frame() |>
  rownames_to_column('EVENT') |>
  arrange(desc(FRT)) |> (\(x) {
    rbind( head(x, 5), tail(x, 5) )
    })()  |>
  pull(EVENT) -> top_bottom_events_logit

Check their PSI distribution across samples. Since the tidy table does not contain the quality scores and the PSI of these events in not present in the compare table I get the PSI from the full inclusion table.

Code
grep_psi(inclusion_tbl = psi_path, vst_id = top_bottom_events_logit) |>
  tidy_vst_psi() -> top_bottom_psi_logit

top_bottom_psi_logit |> 
  left_join(y = mtdf, by = join_by(Sample) ) |>
  plot_PSI_boxplot(order_by_dPSI = F) +
  theme(legend.position = 'none')

These are events that are not considered by the compare method, because they barely show a ∆PSI >= 15 however some of the events (but not all) in the plot above could be considered differentially spliced. For example, HsaINT0126337 or HsaALTD1053720-1/2 might be worth considering. In the next step of the analysis I’ll try to select such type of events using a more robust statistical method, rather than just ranking the by a scaled value.

5.2.3 Testing for significance

A better way to detect this kind of subtle changes with a more robust statistical framework is to calculate a P value using the individual replicates PSI and plot them against the ∆PSI as a Volcano plot. To achieve this first define the sample types.

Code
mrgd_cntrl <- "HeLa_FRT"
mrgd_TAF2OE <- "HeLa_TAF2"
mrgd_NLSTAF2dIDROE <- "HeLa_NLSTAF2dIDR" 

taf2oe <- c('HeLa_TAF2_A', 'HeLa_TAF2_B', 'HeLa_TAF2_C')
taf2didr <- c('HeLa_NLSTAF2dIDR_A', 'HeLa_NLSTAF2dIDR_B', 'HeLa_NLSTAF2dIDR_C')
cntrls <- c('HeLa_FRT_A', 'HeLa_FRT_B', 'HeLa_FRT_C')

Filter for samples that have finite ∆PSI (meaning that is not just NA) and that have maximum one sample per condition as NA (meaning at least 2 samples are not NA).

Code
tidy_events |>
  group_by(EVENT) |>
  summarise(mPSI_TAF2 = mean(PSI[Sample %in% mrgd_TAF2OE], na.rm = T),
            mPSI_TAF2dIDR = mean(PSI[Sample %in% mrgd_NLSTAF2dIDROE], na.rm = T),
            mPSI_CNTRL = mean(PSI[Sample %in% mrgd_cntrl], na.rm = T),
            Samples_NA_TAF2 = sum(is.na(PSI[Sample %in% taf2oe])),
            Samples_NA_TAF2dIDR = sum(is.na(PSI[Sample %in% taf2didr])),
            Samples_NA_CNTRL = sum(is.na(PSI[Sample %in% cntrls])) ) |>
  mutate(dPSI_TAF2 = mPSI_TAF2 - mPSI_CNTRL,
         dPSI_TAF2dIDR = mPSI_TAF2dIDR - mPSI_CNTRL ) |>
  subset(is.finite(dPSI_TAF2) & is.finite(dPSI_TAF2dIDR) ) |>
  # select samples with max 1 NA PSI value per event
  # i.e. at least 2 samples with a finite PSI
  subset( Samples_NA_TAF2 <= 1) |>
  subset( Samples_NA_TAF2dIDR <= 1) |>
  subset( Samples_NA_CNTRL <= 1) |>
  select(EVENT, dPSI_TAF2, dPSI_TAF2dIDR) |>
  unique() |>
  left_join(y = tidy_events, by = join_by(EVENT)) |>
  relocate(GENE, .before = EVENT) |>
  relocate(starts_with('dPSI'), .after = PSI) |>
  # add metadata information
  left_join(y = mtdf, by = join_by(Sample)) |>
  subset(! Replicate == 'Merge') |>
  group_by(EVENT) |>
  # for the statistical test 
  # select only events that are changing a bit to enter the test
  # if not the data is essentially constant and statistical test fails 
  # this is calculated by measuring the PSI difference standard deviation that should not be zero
  mutate(dSD_TAD2 = sd(PSI[Sample %in% cntrls] - PSI[Sample %in% taf2oe], na.rm = T) ) |>
  mutate(dSD_TAD2dIDR = sd(PSI[Sample %in% cntrls] - PSI[Sample %in% taf2didr], na.rm = T) ) |>
  subset(dSD_TAD2 != 0) |>
  subset(dSD_TAD2dIDR != 0) |>
  select(! c(starts_with("dSD_"), PrettySample, CellType)) |> # remove these columns
  # In addition I could select the events that have a ∆PSI different from zero.
  # subset( abs(dPSI_TAF2) > 0.01 ) |>
  # subset( abs(dPSI_TAF2) >= 0.01 ) |> 
  ungroup() -> tidy_dPSI_df

message(length(unique(tidy_dPSI_df$EVENT)), '/', 
        length(unique(tidy_events$EVENT)), 
        ' filtered events that enter the statistical analysis')
6539/14657 filtered events that enter the statistical analysis
Important

The following analysis uses Student’s t-test on PSI values that are bounded between 0 and 100. Some people might find this “statistically wrong” as the PSI distribution does not follow normality. My argument is that PSI is close enough to normality to allow for the use of a t-test and the test itself is pretty robust to non-normality anyway. This means that I believe I can use the t-test for this data; not that I should.

Now let’s check how close to a normal distribution the PSI values are. I’ll start by using the TAF2 OE PSI and use the same mean and standard deviation of those PSI to generate a normally distributed random PSI.

Code
set.seed(16)
tidy_dPSI_df |> 
  subset( OE != 'NLSTAF2dIDR')  |>
  select(EVENT, PSI, Replicate) |>
  subset(!is.na(PSI)) -> test_taf2_dat
  
test_taf2_dat$PSI_Norm <- rnorm(n = length(test_taf2_dat$PSI),
                                mean = mean(test_taf2_dat$PSI),
                                sd = sd(test_taf2_dat$PSI) )

The first test I do is simply checking the real data and the theoretical normal distributions

Code
test_taf2_dat |>
  pivot_longer(cols = starts_with('PSI')) |>
  ggplot( aes(x = value, fill = name) ) +
  geom_histogram(binwidth = 1) + 
  labs(x = 'PSI') + theme_bw() + 
  theme(legend.position = c(0.8, 0.8),
        legend.title = element_blank())

The real data (in red) is inflated for zeros (full skipping) and 100 (full inclusion) which expected, however the PSI in the middle to resemble a normal distribution (in blue), in fact binning the data into 10 bins shows a more similar overlap

Code
test_taf2_dat |>
  pivot_longer(cols = starts_with('PSI')) |>
  ggplot( aes(x = value, fill = name) ) +
  geom_histogram(binwidth = 10, colour = 'black') +
  labs(x = 'PSI') + theme_bw() + 
  theme(legend.position = c(0.8, 0.8),
        legend.title = element_blank())

Lastly I do a quantile-quantile plot:

Code
test_taf2_dat |>
  ggplot(aes(sample = PSI)) +
    stat_qq() + stat_qq_line() + coord_cartesian(ylim = c(-10, 110)) + theme_bw()

Now that I showed how the PSI is not a perfect normal distribution but it’s close enough in my view, I calculate P-values for TAF2 OE.

Code
tidy_dPSI_df |> 
  subset( OE != 'NLSTAF2dIDR') |>
  group_by(EVENT) |>
  t_test(formula = PSI ~ OE, ref.group = 'FRT', alternative = "two.sided", 
         paired = F, p.adjust.method = 'none', conf.level = 0.95, detailed = F) |>
  adjust_pvalue(method = "BH") |> add_significance("p.adj") |>
  select(EVENT, p, p.adj, p.adj.signif) |>
  setNames(c('EVENT', 'p_TAF2', 'padj_TAF2', 'padj_signif_TAF2')) -> pvals_TAF2

Do the same P-values estimations with the t-test for NLS TAF2 ∆IDR OE

Code
tidy_dPSI_df |> 
  subset( OE != 'TAF2') |>
  group_by(EVENT) |>
  t_test(formula = PSI ~ OE, ref.group = 'FRT', alternative = "two.sided", 
         paired = F, p.adjust.method = 'none', conf.level = 0.95, detailed = F) |>
  adjust_pvalue(method = "BH") |> add_significance("p.adj") |>
  select(EVENT, p, p.adj, p.adj.signif) |>
  setNames(c('EVENT', 'p_TAF2dIDR', 'padj_TAF2dIDR', 'padj_signif_TAF2dIDR')) -> pvals_TAF2dIDR

Combine all dataframes into one. If an event as a P-value that is NA, discard it.

Code
tidy_dPSI_df |> 
  left_join(y = pvals_TAF2, by = join_by(EVENT)) |>
  left_join(y = pvals_TAF2dIDR, by = join_by(EVENT)) |>
  subset(!is.na(p_TAF2)) |>
  subset(!is.na(p_TAF2dIDR)) -> tidy_dPSI_pval

I also check the P-value distribution.

Code
tidy_dPSI_pval |>
  select(EVENT, p_TAF2, p_TAF2dIDR) |>
  pivot_longer(cols = starts_with("p")) |>
  ggplot() +
  aes(x = value, fill = name) +
  geom_histogram(bins = 25, colour = 'black') + 
  labs(x = "P-values") + theme_bw() + theme(legend.position = c(0.85, 0.85),
                                            legend.title = element_blank())

The major drawback of this method is that the multiple hypothesis testing correction (adjust_pvalue(method = "BH")) results in no event being differentially expressed by having an adjusted P-value <= 0.05.

Code
tidy_dPSI_pval |>
  select(EVENT, starts_with("p")) |> select(!PSI) |> unique() |>
  arrange(padj_TAF2)  |>
  head(5)
# A tibble: 5 × 7
  EVENT               p_TAF2 padj_TAF2 padj_signif_TAF2 p_TAF2dIDR padj_TAF2dIDR
  <chr>                <dbl>     <dbl> <chr>                 <dbl>         <dbl>
1 HsaEX0017210       9.41e-5     0.514 ns                   0.0949         0.963
2 HsaEX0029228       2.36e-4     0.514 ns                   0.145          0.963
3 HsaEX0060986       1.81e-4     0.514 ns                   0.331          0.963
4 HsaALTA0007106-4/5 6.52e-4     0.547 ns                   0.346          0.963
5 HsaEX0002674       6.69e-4     0.547 ns                   0.158          0.963
# ℹ 1 more variable: padj_signif_TAF2dIDR <chr>
Code
tidy_dPSI_pval |>
  select(EVENT, starts_with("p")) |> select(!PSI) |> unique() |>
  arrange(padj_TAF2dIDR)  |>
  head(5)
# A tibble: 5 × 7
  EVENT               p_TAF2 padj_TAF2 padj_signif_TAF2 p_TAF2dIDR padj_TAF2dIDR
  <chr>                <dbl>     <dbl> <chr>                 <dbl>         <dbl>
1 HsaALTD0002417-1/5 0.184       0.930 ns                 0.000209         0.683
2 HsaALTD0003442-1/4 0.00122     0.649 ns                 0.000179         0.683
3 HsaALTA0006052-1/2 0.00975     0.930 ns                 0.000911         0.851
4 HsaALTA0006052-2/2 0.00975     0.930 ns                 0.000911         0.851
5 HsaALTD0003442-2/4 0.0129      0.930 ns                 0.000677         0.851
# ℹ 1 more variable: padj_signif_TAF2dIDR <chr>

To control for this I set a non-corrected P-value threshold to 0.01, which is more conservative, plus the size effect threshold is |∆PSI| >= 10.

Code
dPSI_squish <- 40

tidy_dPSI_pval |>
  # label the events that are below significant thresholds
  mutate(DSE_TAF2 = p_TAF2 < pval_thrshld & abs(dPSI_TAF2) >= dPSI_thrshld ) |>
  mutate(DSE_TAF2dIDR = p_TAF2dIDR < pval_thrshld & abs(dPSI_TAF2dIDR) >= dPSI_thrshld ) -> tidy_dPSI_pval

5.2.4 Volcano plots

Volcano plot of TAF2 OE. Points with |∆PSI| > 40 are squished on the sides of the plot.

Code
# select fewer columns for plotting
subset(tidy_dPSI_pval, OE == 'TAF2') |>
  select(GENE, EVENT, dPSI_TAF2, p_TAF2, padj_TAF2, DSE_TAF2) |>
  unique() -> plot_TAF2

ggplot(plot_TAF2) +
  aes(x = dPSI_TAF2, y = -log10(p_TAF2), fill = DSE_TAF2) +
  geom_hline(yintercept = -log10(pval_thrshld), linewidth = 0.1, colour = 'black') +
  geom_vline(xintercept = -dPSI_thrshld, linewidth = 0.1, colour = 'black') +
  geom_vline(xintercept = dPSI_thrshld, linewidth = 0.1, colour = 'black') +
  geom_text_repel(data = subset(plot_TAF2, DSE_TAF2),
                  aes(label = GENE, x = dPSI_TAF2, y = -log10(p_TAF2)),
                      seed = 16, show.legend = F, segment.curvature = -1e-20,
                      family = "Arial", size = 2, nudge_x = -0.25,
                      segment.color = 'black', verbose = F, lwd = 0.1,
                      box.padding = grid::unit(1, "mm"),
                      point.padding = grid::unit(0.55, "mm"),
                      max.overlaps = 20 ) +
  geom_point(shape = 21, show.legend = F, alpha = 1, stroke = 0.2) +
  scale_fill_manual(values = c('gray84', 'firebrick3') ) +
  scale_x_continuous(n.breaks = 10, oob = scales::squish, limits = c(-dPSI_squish, dPSI_squish),
                     expand = expansion(mult = 0, add = 0.5)) +
  scale_y_continuous(expand = expansion(mult = c(0.01, 0.05), add = 0) ) +
  coord_cartesian(ylim = c(0, 4.25)) + # set this to have the 2 volcano plots with the same Y-axis range
  labs( x = "\u0394PSI (TAF2 OE - Cntrl)", y = expression(-log[10] ~ "P-Value") ) +
  theme_classic(base_family = 'Arial') +
  theme(axis.text = element_text(colour = 'black'),
        axis.line = element_line(colour = 'black', linewidth = 0.2),
        panel.grid.major = element_line(linewidth = 0.1),
        panel.grid.minor = element_blank(),
        panel.background = element_blank(),
        plot.background = element_blank())

Code
ggsave(filename = '12_tidy_TAF2_Volcano.pdf', plot = last_plot(), 
       device = cairo_pdf, path = plot_dir, units = 'cm', width = 10,
       height = 12)

Select the differentially spliced events in TAF2 OE and plot only the exon skipping events.

Code
stat_TAF2 <- unique(subset(plot_TAF2, DSE_TAF2)$EVENT)
message(length(stat_TAF2), ' significant events with low ∆PSI in TAF2 OE')
48 significant events with low ∆PSI in TAF2 OE
Code
subset(tidy_dPSI_pval, OE == 'TAF2') |>
  select(GENE, EVENT, dPSI_TAF2, p_TAF2, padj_TAF2, DSE_TAF2) |>
  subset(DSE_TAF2) |>
  unique() -> tidy_signif_TAF

up_tidy_TAF2 <- subset(tidy_signif_TAF, dPSI_TAF2 >= 0)
do_tidy_TAF2 <- subset(tidy_signif_TAF, dPSI_TAF2 < 0)

EX_up_tidy_TAF2 <- grep(pattern = "^HsaEX", x = up_tidy_TAF2$EVENT, value = T)
EX_do_tidy_TAF2 <- grep(pattern = "^HsaEX", x = do_tidy_TAF2$EVENT, value = T)

IN_up_tidy_TAF2 <- grep(pattern = "^HsaINT", x = up_tidy_TAF2$EVENT, value = T)
IN_do_tidy_TAF2 <- grep(pattern = "^HsaINT", x = do_tidy_TAF2$EVENT, value = T)
Code
A3_up_tidy_TAF2 <- grep(pattern = "^HsaALTA", x = up_tidy_TAF2$EVENT, value = T)
A3_do_tidy_TAF2 <- grep(pattern = "^HsaALTA", x = do_tidy_TAF2$EVENT, value = T)

A5_up_tidy_TAF2 <- grep(pattern = "^HsaALTD", x = up_tidy_TAF2$EVENT, value = T)
A5_do_tidy_TAF2 <- grep(pattern = "^HsaALTD", x = do_tidy_TAF2$EVENT, value = T)

Check that the IDs are from the same exon

Code
A3_tidy_TAF2 <- match_alt35_id(up_ids = A3_up_tidy_TAF2, do_ids = A3_do_tidy_TAF2)
A5_tidy_TAF2 <- match_alt35_id(up_ids = A5_up_tidy_TAF2, do_ids = A5_do_tidy_TAF2)

Select some events to plot

Code
set.seed(16)
subset_stat_TAF2 <- grep(pattern = 'HsaEX', x = stat_TAF2, value = T)
subset_stat_TAF2 <- c('HsaEX0065488', sample(stat_TAF2[stat_TAF2 != "HsaEX0065488"], size = 8) )

grep_psi(inclusion_tbl = psi_path, vst_id = subset_stat_TAF2 ) |>
  tidy_vst_psi() -> psi_stat_TAF2

Plot PSI values.

Code
psi_stat_TAF2 |> 
  left_join(y = mtdf, by = join_by(Sample)) |>
  plot_PSI_boxplot(order_by_dPSI = F, num_col = 5)

The event HsaEX0065488 in the gene TMEM143 seems to have a TAF2 OE specific skipping effect. In fact, this event was not found in the compare method due to the default limit of a ∆PSI >=15

Code
"HsaEX0065488" %in% shared_cmpr_all_ids
[1] FALSE

So in this way we don’t have to lower the ∆PSI threshold and risk of obtaining too many false positives, but we can only extract significant events with a low difference in ∆PSI.

Volcano plot of NLS-TAF2 ∆IDR OE. Points with |∆PSI| > 40 are squished on the sides of the plot.

Code
subset(tidy_dPSI_pval, OE == 'NLSTAF2dIDR') |>
  select(GENE, EVENT, dPSI_TAF2dIDR, p_TAF2dIDR, padj_TAF2dIDR, DSE_TAF2dIDR) |>
  unique() -> plot_TAF2dIDR

ggplot(plot_TAF2dIDR) +
  aes(x = dPSI_TAF2dIDR, y = -log10(p_TAF2dIDR), fill = DSE_TAF2dIDR) +
  geom_hline(yintercept = -log10(pval_thrshld), linewidth = 0.1, colour = 'black') +
  geom_vline(xintercept = -dPSI_thrshld, linewidth = 0.1, colour = 'black') +
  geom_vline(xintercept = dPSI_thrshld, linewidth = 0.1, colour = 'black') +
  geom_text_repel(data = subset(plot_TAF2dIDR, DSE_TAF2dIDR), 
                  aes(label = GENE, x = dPSI_TAF2dIDR, y = -log10(p_TAF2dIDR)),
                  seed = 16, show.legend = F, segment.curvature = -1e-20, 
                  family = "Arial", size = 2, nudge_x = -0.25,
                  segment.color = 'black',verbose = F, 
                  box.padding = grid::unit(1, "mm"),
                  point.padding = grid::unit(0.55, "mm"),
                  max.overlaps = 20 ) +
  geom_point(shape = 21, show.legend = F, alpha = 1, stroke = 0.2) +
  scale_fill_manual(values = c('gray84', 'firebrick3') ) +
  scale_x_continuous(n.breaks = 10, oob = scales::squish, limits = c(-dPSI_squish, dPSI_squish),
                     expand = expansion(mult = 0, add = 0.5)) +
  scale_y_continuous(expand = expansion(mult = c(0.01, 0.05), add = 0) ) +
  labs(x = "\u0394PSI (NLS-TAF2 \u0394IDR OE - Cntrl)", y = expression(-log[10] ~ "P-Value")) +
  coord_cartesian(ylim = c(0, 4.25)) + # set this to have the 2 volcano plots with the same Y-axis range
  theme_classic(base_family = 'Arial') +
  theme(axis.text = element_text(colour = 'black'),
        axis.line = element_line(colour = 'black', linewidth = 0.2),
        panel.grid.major = element_line(linewidth = 0.1),
        panel.grid.minor = element_blank(),
        panel.background = element_blank(),
        plot.background = element_blank())

Code
ggsave(filename = '13_tidy_TAF2dIDR_Volcano.pdf', plot = last_plot(), 
       device = cairo_pdf, path = plot_dir, units = 'cm', width = 10,
       height = 12)

Select the differentially spliced events in TAF2 ∆IDR OE and plot some of the events.

Code
stat_TAF2dIDR <- unique(subset(plot_TAF2dIDR, DSE_TAF2dIDR)$EVENT)
message(length(stat_TAF2dIDR), ' significant events with low ∆PSI in TAF2 ∆IDR OE')
31 significant events with low ∆PSI in TAF2 ∆IDR OE
Code
subset(tidy_dPSI_pval, OE == 'NLSTAF2dIDR') |>
  select(GENE, EVENT, dPSI_TAF2dIDR, p_TAF2dIDR, padj_TAF2dIDR, DSE_TAF2dIDR) |>
  subset(DSE_TAF2dIDR) |>
  unique() -> tidy_signif_TAFdIDR

up_tidy_TAF2dIDR <- subset(tidy_signif_TAFdIDR, dPSI_TAF2dIDR >= 0)
do_tidy_TAF2dIDR <- subset(tidy_signif_TAFdIDR, dPSI_TAF2dIDR < 0)

EX_up_tidy_TAF2dIDR <- grep(pattern = "^HsaEX", x = up_tidy_TAF2dIDR$EVENT, value = T)
EX_do_tidy_TAF2dIDR <- grep(pattern = "^HsaEX", x = do_tidy_TAF2dIDR$EVENT, value = T)

IN_up_tidy_TAF2dIDR <- grep(pattern = "^HsaINT", x = up_tidy_TAF2dIDR$EVENT, value = T)
IN_do_tidy_TAF2dIDR <- grep(pattern = "^HsaINT", x = do_tidy_TAF2dIDR$EVENT, value = T)
Code
A3_up_tidy_TAF2dIDR <- grep(pattern = "^HsaALTA", x = up_tidy_TAF2dIDR$EVENT, value = T)
A3_do_tidy_TAF2dIDR <- grep(pattern = "^HsaALTA", x = do_tidy_TAF2dIDR$EVENT, value = T)

A5_up_tidy_TAF2dIDR <- grep(pattern = "^HsaALTD", x = up_tidy_TAF2dIDR$EVENT, value = T)
A5_do_tidy_TAF2dIDR <- grep(pattern = "^HsaALTD", x = do_tidy_TAF2dIDR$EVENT, value = T)

Check that the IDs are from the same exon

Code
A3_tidy_TAF2dIDR <- match_alt35_id(up_ids = A3_up_tidy_TAF2dIDR, do_ids = A3_do_tidy_TAF2dIDR)
A5_tidy_TAF2dIDR <- match_alt35_id(up_ids = A5_up_tidy_TAF2dIDR, do_ids = A5_do_tidy_TAF2dIDR)

Select some events to plot

Code
set.seed(16)
# subset_stat_TAF2dIDR <- grep(pattern = '^HsaEX', x = stat_TAF2dIDR, value = T)
if( length(stat_TAF2dIDR) >= 9 ){
  subset_stat_TAF2dIDR <- sample(stat_TAF2dIDR, size = 9)
}

Get events

Code
grep_psi(inclusion_tbl = psi_path, vst_id = subset_stat_TAF2dIDR ) |>
  tidy_vst_psi() -> psi_stat_TAF2dIDR
Code
psi_stat_TAF2dIDR |> 
  left_join(y = mtdf, by = join_by(Sample) ) |>
  plot_PSI_boxplot(order_by_dPSI = F, num_col = 5)

The exon skipping event HsaEX0029450 is in the HDAC8 gene seems quite interesting as it is an ORF preserving event. Also the event HsaEX0058132 is in the SHANK associated RH domain interactor (SHARPIN) seems quite interesting.

The event HsaEX0050973 was characterised before in HeLa cells, where the inclusion isoform leads to multi-nucleated cells when over-expressed. This case was also found with the compare method.

Code
"HsaEX0050973" %in% specific_bona_fide_TAF2dIDR_ids
[1] TRUE

With this approach we can include some more AS events to the DSE list.

Code
stat_both_oe_ids <- unique(c(stat_TAF2, stat_TAF2dIDR))
message('tidy method added ', length(stat_both_oe_ids), ' DSEs, of which ',
        length(stat_both_oe_ids[stat_both_oe_ids %in% any_cmpr_both_oe_ids]),
        ' were also found by the compared method')
tidy method added 76 DSEs, of which 19 were also found by the compared method

5.2.5 Appendix with wilcoxon test

For statistical fairness repeat the analysis and check overlap with t-test.

5.3 Bayesian method (from diff)

Import the vast-tools diff tables. Note: diff outputs PSIs from 0 to 1, so I multiply them by 100.

This tables were create with this command line short cut:

Code
tail -n +2 Diff_output_all_events_noB3_pIR.tab | \
      awk '{OFS="\t"} { if($6 >= 0.15) print $1, $2, $5, $6}' | \
      sort -k3,4nr  | sed -e $'1iGENE\tEVENT\tdPSI\tMVdPSIat95' > Fltrd

Which filters for events with a minimum value of |∆PSI| >= 15 with a 95% confidence (column number 6) and sorts them by observed ∆PSI (column number 5).

Code
diff_TAF2 <- read_vst_tbl(path = diff_TAF2_path, show_col_types = FALSE) |>
    mutate(dPSI = dPSI * 100) |>
    mutate(MVdPSIat95 = MVdPSIat95 * 100) |>
    mutate(COMPLEX = case_when( grepl(x = EVENT, pattern = '^HsaEX') ~ 'S',
                                grepl(x = EVENT, pattern = '^HsaINT') ~ 'IR',
                                grepl(x = EVENT, pattern = '^HsaALTD') ~ 'Alt5',
                                grepl(x = EVENT, pattern = '^HsaALTA') ~ 'Alt3') ) |>
    mutate(comparison = 'TAF2')
    
diff_TAF2dIDR <- read_vst_tbl(path = diff_TAF2dIDR_path, show_col_types = FALSE) |>
    mutate(dPSI = dPSI * 100) |>
    mutate(MVdPSIat95 = MVdPSIat95 * 100) |>
    mutate(COMPLEX = case_when( grepl(x = EVENT, pattern = '^HsaEX') ~ 'S',
                                grepl(x = EVENT, pattern = '^HsaINT') ~ 'IR',
                                grepl(x = EVENT, pattern = '^HsaALTD') ~ 'Alt5',
                                grepl(x = EVENT, pattern = '^HsaALTA') ~ 'Alt3') ) |>
    mutate(comparison = 'NLS-TAF2 ∆IDR')

How many AS events?

Code
diff_events_TAF2 <- unique(diff_TAF2$EVENT)
message(length(diff_events_TAF2), ' diff events TAF2 OE')
130 diff events TAF2 OE
Code
diff_events_TAF2dIDR <- unique(diff_TAF2dIDR$EVENT)
message(length(diff_events_TAF2dIDR), ' diff events TAF2 ∆IDR OE')
158 diff events TAF2 ∆IDR OE
Code
diff_events_common <- intersect(diff_events_TAF2, diff_events_TAF2dIDR)
message(length(diff_events_common), ' diff events common between both OE')
27 diff events common between both OE
Code
# make a list with the DSE of each OE
diff_events_TAF2_only <- diff_events_TAF2[(! diff_events_TAF2 %in% diff_events_common)]
diff_events_TAF2dIDR_only <- diff_events_TAF2dIDR[(! diff_events_TAF2dIDR %in% diff_events_common)]

Interestingly not many events are common between the 2 OE.

Plot the actual ∆PSI of the events that were filtered to have a minimum value of |∆PSI| >=15 with a 95% confidence.

Code
plot_dPSI_hist_grpd(diff_TAF2) +
  geom_vline(xintercept = -15) + geom_vline(xintercept = 15)

Code
ggsave(filename = '14_diff_dPSI_TAF2_Histogram.pdf', plot = last_plot(), 
       device = cairo_pdf, path = plot_dir, units = 'cm', width = 12,
       height = 6)

There seem to be a good balance between up and down regulated events.

Code
plot_dPSI_hist_grpd(diff_TAF2dIDR) +
  geom_vline(xintercept = -15) +   geom_vline(xintercept = 15) 

Code
ggsave(filename = '15_diff_dPSI_TAF2dIDR_Histogram.pdf', plot = last_plot(), 
       device = cairo_pdf, path = plot_dir, units = 'cm', width = 12,
       height = 6)

These seems to be more down regulated AS events in TAF2 ∆IDR OE.

The diff table does not contain all the information about the AS event as the main full inclusion table so I get such info from the full inclusion table.

Code
fetch_diff_IDs <- unique(c(diff_events_TAF2, diff_events_TAF2dIDR))
message(length(fetch_diff_IDs), ' events to get info... this may take some time')
261 events to get info... this may take some time

This next step might take some time to process.

Code
grep_psi(inclusion_tbl = psi_path, vst_id = fetch_diff_IDs) |>
  tidy_vst_psi() -> diff_psi_df

To avoid having to repeat this step in the future I save this dataframe to file and not execute the code chunk above any more.

Code
diff_psi_df <- diff_psi_df |> 
  mutate(comparison = case_when(EVENT %in% diff_events_TAF2_only ~ 'TAF2',
                                EVENT %in% diff_events_TAF2dIDR_only ~ 'TAF2dIDR',
                                EVENT %in% diff_events_common ~ 'BOTH' ) )

write_delim(x = diff_psi_df, file = file.path(diff_dir, 'Fltrd_Annotated_Diff_Events_HeLa_Both_OE.tab'),
            delim = '\t') 

Import the table that was written the first time.

Code
diff_psi_df <- read_vst_tbl(path = file.path(diff_dir, 'Fltrd_Annotated_Diff_Events_HeLa_Both_OE.tab'),
                            show_col_types = FALSE) |>
  left_join(y = mtdf, by = join_by(Sample))

Add the ∆PSI and Max ∆PSI at 95% confidence information for each comparison

Code
# discard TYPE and comparison column
diff_TAF2 <- diff_TAF2 |> select(!c('COMPLEX', 'comparison'))
diff_TAF2dIDR <- diff_TAF2dIDR |> select(!c('COMPLEX', 'comparison'))

colnames(diff_TAF2) <- c('GENE', 'EVENT', 'dPSI_TAF2', 'MVdPSIat95_TAF2')
colnames(diff_TAF2dIDR) <- c('GENE', 'EVENT', 'dPSI_TAF2dIDR', 'MVdPSIat95_TAF2dIDR')

exon_type <- c("S", "C1", "C2", "C3", "ANN", "MIC")
intron_type <- "IR"
alt_ss <- c("Alt3", "Alt5")

left_join(x = diff_psi_df, y = diff_TAF2, by = join_by(GENE, EVENT) ) |>
  left_join(y = diff_TAF2dIDR, by = join_by(GENE, EVENT)) |>
  mutate(method = 'diff') |>
  mutate(signif_in = case_when(EVENT %in% diff_events_TAF2 ~ 'dseTAF2',
                               EVENT %in% diff_events_TAF2dIDR ~ 'dseTAF2dIDR') ) |>
  # group events by directions
  mutate(signif_in = ifelse(EVENT %in% diff_events_common, yes = 'dseBOTH', no = signif_in) ) |>
  mutate(direction_TAF2 = ifelse( dPSI_TAF2 >= 0, yes = 'upTAF2', no = 'downTAF2'),
         direction_TAF2dIDR = ifelse( dPSI_TAF2dIDR >= 0, yes = 'upTAF2dIDT', no = 'downTAF2dIDR') ) |>
  mutate(direction_TAF2 = ifelse(is.na(direction_TAF2), yes = 'NO', no = direction_TAF2 ),
         direction_TAF2dIDR = ifelse(is.na(direction_TAF2dIDR), yes = 'NO', no = direction_TAF2dIDR ) ) |>
  mutate(GROUP = paste0(direction_TAF2, '_', direction_TAF2dIDR) ) |>
  mutate(TYPE = case_when(COMPLEX %in% exon_type ~ 'Exon',
                          COMPLEX %in% intron_type ~ 'Intron',
                          COMPLEX %in% alt_ss ~ 'Alt. Splice Site') ) |>
  ungroup() -> processed_tbl_diff

# count number of events
processed_tbl_diff |>
  select(GENE, EVENT, GROUP, TYPE, signif_in) |>
  unique() |>
  group_by(GROUP, TYPE, signif_in) |>
  summarise(Num_DSE = n(), .groups = 'keep') -> num_dse_diff

Check the number of DSE in each condition. In the next plot I define if an event is differentially spliced in TAF2, NLS-TAF2∆IDR or both. The GROUP column defines direction of the DSE relative to control for TAF2 and NLS-TAF2∆IDR, respectively, meaning that if an event is not DSE in that condition it is labelled as “NO”

Code
num_dse_diff |>
  mutate(GROUP = gsub(pattern = '_', replacement = '\n', x = GROUP)) |> 
ggplot() +
  facet_wrap(~signif_in, scales = 'free_x') +
  aes(x = GROUP, y = Num_DSE, fill = TYPE) +
  geom_bar(stat = 'identity', colour = 'black', linewidth = 0.2, width = 0.75) + 
  geom_text(aes(label = Num_DSE, group = TYPE), size = 3,
            position = position_stack(vjust = 0.5) ) +
  scale_fill_discrete(name = 'dse type') +
  # scale_x_discrete(guide = guide_axis(n.dodge = 2))+
  scale_y_continuous(expand = expansion(mult = c(0.01, 0.05), add = 0) ) +
  coord_cartesian(clip = 'off')+
  labs(y = 'Num. DSE') +
  theme_classic(base_family = 'Arial', base_size = 8) +
  theme(axis.text = element_text(colour = 'black'),
        # axis.text.x = element_text(angle = 45, hjust = 1),
        axis.title.x = element_blank(),
        axis.line = element_line(colour = 'black', linewidth = 0.2),
        legend.position = c(0.15, 0.8),
        panel.grid.major.x = element_blank(),
        panel.grid.major.y = element_line(linewidth = 0.1),
        panel.grid.minor = element_blank(),
        panel.background = element_blank(),
        plot.background = element_blank(), 
        strip.background = element_blank() )

Code
ggsave(filename = '16_diff_Num_DSE_Direction_Bar.pdf', plot = last_plot(), 
       device = cairo_pdf, path = plot_dir, units = 'cm', width = 12,
       height = 10)

Now that the DSEs have been identified by each method it’s time to check how many events overlap and how many were quantified.

Separate between the exons and the introns that are up or down spliced in TAF2 OE

Code
up_diff_TAF2 <- subset(diff_TAF2, dPSI_TAF2 >= 0)
do_diff_TAF2 <- subset(diff_TAF2, dPSI_TAF2 < 0)

EX_up_diff_TAF2 <- grep(pattern = "^HsaEX", x = up_diff_TAF2$EVENT, value = T)
EX_do_diff_TAF2 <- grep(pattern = "^HsaEX", x = do_diff_TAF2$EVENT, value = T)

IN_up_diff_TAF2 <- grep(pattern = "^HsaINT", x = up_diff_TAF2$EVENT, value = T)
IN_do_diff_TAF2 <- grep(pattern = "^HsaINT", x = do_diff_TAF2$EVENT, value = T)
Code
A3_up_diff_TAF2 <- grep(pattern = "^HsaALTA", x = up_diff_TAF2$EVENT, value = T)
A3_do_diff_TAF2 <- grep(pattern = "^HsaALTA", x = do_diff_TAF2$EVENT, value = T)

A5_up_diff_TAF2 <- grep(pattern = "^HsaALTD", x = up_diff_TAF2$EVENT, value = T)
A5_do_diff_TAF2 <- grep(pattern = "^HsaALTD", x = do_diff_TAF2$EVENT, value = T)

A3_diff_TAF2 <- match_alt35_id(up_ids = A3_up_diff_TAF2, do_ids = A3_do_diff_TAF2)
A5_diff_TAF2 <- match_alt35_id(up_ids = A5_up_diff_TAF2, do_ids = A5_do_diff_TAF2)
Code
message('HeLa TAF OE diff method:',
        '\nUP Exon: ', length(EX_up_diff_TAF2),
        '\nDOWN Exon: ', length(EX_do_diff_TAF2),
        '\nUP Intron: ', length(IN_up_diff_TAF2),
        '\nDOWN Intron: ', length(IN_do_diff_TAF2),
        "\nDifferentially spliced 3' splice sites: ", length(A3_diff_TAF2),
        "\nDifferentially spliced 5' splice sites: ", length(A5_diff_TAF2))
HeLa TAF OE diff method:
UP Exon: 19
DOWN Exon: 21
UP Intron: 13
DOWN Intron: 17
Differentially spliced 3' splice sites: 28
Differentially spliced 5' splice sites: 32

Separate between the exons and the introns that are up or down spliced in TAF2∆IDR OE

Code
up_diff_TAF2dIDR <- subset(diff_TAF2dIDR, dPSI_TAF2dIDR >= 0)
do_diff_TAF2dIDR <- subset(diff_TAF2dIDR, dPSI_TAF2dIDR < 0)

EX_up_diff_TAF2dIDR <- grep(pattern = "^HsaEX", x = up_diff_TAF2dIDR$EVENT, value = T)
EX_do_diff_TAF2dIDR <- grep(pattern = "^HsaEX", x = do_diff_TAF2dIDR$EVENT, value = T)

IN_up_diff_TAF2dIDR <- grep(pattern = "^HsaINT", x = up_diff_TAF2dIDR$EVENT, value = T)
IN_do_diff_TAF2dIDR <- grep(pattern = "^HsaINT", x = do_diff_TAF2dIDR$EVENT, value = T)
Code
A3_up_diff_TAF2dIDR <- grep(pattern = "^HsaALTA", x = up_diff_TAF2dIDR$EVENT, value = T)
A3_do_diff_TAF2dIDR <- grep(pattern = "^HsaALTA", x = do_diff_TAF2dIDR$EVENT, value = T)

A5_up_diff_TAF2dIDR <- grep(pattern = "^HsaALTD", x = up_diff_TAF2dIDR$EVENT, value = T)
A5_do_diff_TAF2dIDR <- grep(pattern = "^HsaALTD", x = do_diff_TAF2dIDR$EVENT, value = T)

A3_diff_TAF2dIDR <- match_alt35_id(up_ids = A3_up_tidy_TAF2dIDR, do_ids = A3_do_tidy_TAF2dIDR)
A5_diff_TAF2dIDR <- match_alt35_id(up_ids = A5_up_tidy_TAF2dIDR, do_ids = A5_do_tidy_TAF2dIDR)
Code
message('HeLa TAF OE diff method:',
        '\nUP Exon: ', length(EX_up_diff_TAF2dIDR),
        '\nDOWN Exon: ', length(do_diff_TAF2dIDR),
        '\nUP Intron: ', length(IN_up_diff_TAF2dIDR),
        '\nDOWN Intron: ', length(IN_do_diff_TAF2dIDR),
        "\nDifferentially spliced 3' splice sites: ", length(A3_diff_TAF2dIDR),
        "\nDifferentially spliced 5' splice sites: ", length(A5_diff_TAF2dIDR))
HeLa TAF OE diff method:
UP Exon: 13
DOWN Exon: 4
UP Intron: 10
DOWN Intron: 31
Differentially spliced 3' splice sites: 1
Differentially spliced 5' splice sites: 8

6 DSE overlap

6.1 By AS event type

6.1.1 Exons

Code
EX_up_TAF2 <- unique(c(EX_up_cmpr_TAF2, EX_up_tidy_TAF2, EX_up_diff_TAF2))
EX_do_TAF2 <- unique(c(EX_do_cmpr_TAF2, EX_do_tidy_TAF2, EX_do_diff_TAF2))

EX_up_TAF2dIDR <- unique(c(EX_up_cmpr_TAF2dIDR, EX_up_tidy_TAF2dIDR, EX_up_diff_TAF2dIDR))
EX_do_TAF2dIDR <- unique(c(EX_do_cmpr_TAF2dIDR, EX_do_tidy_TAF2dIDR, EX_do_diff_TAF2dIDR))

dsexons_list <- list(EX_up_TAF2, EX_up_TAF2dIDR, EX_do_TAF2, EX_do_TAF2dIDR)
 
names(dsexons_list) <- c('EX UP TAF2', 'EX UP ∆IDR', 'EX DOWN TAF2', 'EX DOWN ∆IDR')
Code
intersection_exons_ids <- upset(fromList(dsexons_list), text.scale = 1.5, 
                                nintersects = 6, 
                                mb.ratio = c(0.7, 0.3),
                                point.size = 5, order.by = 'freq', keep.order = T,
                                sets = rev(names(dsexons_list)),
                                main.bar.color = "gray16") 

Show the number and overlap of events found by each method with an upset plot:

Code
intersection_exons_ids

Code
cairo_pdf(file = file.path(plot_dir, '17_overlap_exons_Upset.pdf'), 
          width = 4.1, height = 4, bg = NA)
intersection_exons_ids
dev.off()
png 
  2 

6.1.2 Introns

Code
IN_up_TAF2 <- unique(c(IN_up_cmpr_TAF2, IN_up_tidy_TAF2, IN_up_diff_TAF2))
IN_do_TAF2 <- unique(c(IN_do_cmpr_TAF2, IN_do_tidy_TAF2, IN_do_diff_TAF2))

IN_up_TAF2dIDR <- unique(c(IN_up_cmpr_TAF2dIDR, IN_up_tidy_TAF2dIDR, IN_up_diff_TAF2dIDR))
IN_do_TAF2dIDR <- unique(c(IN_do_cmpr_TAF2dIDR, IN_do_tidy_TAF2dIDR, IN_do_diff_TAF2dIDR))
 
dsintrons_list <- list(IN_up_TAF2, IN_up_TAF2dIDR, IN_do_TAF2, IN_do_TAF2dIDR)
names(dsintrons_list) <- c('INT UP TAF2', 'INT UP ∆IDR', 'INT DOWN TAF2', 'INT DOWN ∆IDR')

Make upset plot introns

Code
intersection_introns_ids <- upset(fromList(dsintrons_list), text.scale = 1.5, nintersects = NA, 
                                  mb.ratio = c(0.7, 0.3), 
                                  point.size = 5, order.by = 'freq', keep.order = T,
                                  sets = rev(names(dsintrons_list)),
                                  main.bar.color = "gray16") 

Show the number and overlap of events found by each method with an upset plot:

Code
intersection_introns_ids

Code
cairo_pdf(file = file.path(plot_dir, '18_overlap_introns_Upset.pdf'), 
          width = 4.1, height = 4, bg = NA)
intersection_introns_ids
dev.off()
png 
  2 

6.1.3 Alt 3’ or 5’ splice site

Code
A3_TAF2 <- unique(c(A3_cmpr_TAF2, A3_tidy_TAF2, A3_diff_TAF2))
A5_TAF2 <- unique(c(A5_cmpr_TAF2, A5_tidy_TAF2, A5_diff_TAF2))

A3_TAF2dIDR <- unique(c(A3_cmpr_TAF2dIDR, A3_tidy_TAF2dIDR, A3_diff_TAF2dIDR))
A5_TAF2dIDR <- unique(c(A5_cmpr_TAF2dIDR, A5_tidy_TAF2dIDR, A5_diff_TAF2dIDR))

dsaltss_list <- list(A3_TAF2, A3_TAF2dIDR, A5_TAF2, A5_TAF2dIDR)
names(dsaltss_list) <- c('ALT 3SS TAF2', 'ALT 3SS ∆IDR', 'ALT 5SS TAF2', 'ALT 5SS ∆IDR')

Make upset plot alternative splice sites

Code
intersection_ss_ids <- upset(fromList(dsaltss_list), text.scale = 1.5, nintersects = NA, 
                                  mb.ratio = c(0.7, 0.3), 
                                  point.size = 5, order.by = 'freq', keep.order = T,
                                  sets = rev(names(dsaltss_list)),
                                  main.bar.color = "gray16") 

Show the number and overlap of events found by each method with an upset plot:

Code
intersection_ss_ids

Code
cairo_pdf(file = file.path(plot_dir, '19_overlap_alt_splice_sites_Upset.pdf'), 
          width = 4.1, height = 4, bg = NA)
intersection_introns_ids
dev.off()
png 
  2 

6.2 By method

Plot how many differentially Spliced Events by method? This code doesn’t separate between exons or introns and between up or down-regulated events.

Check the number of DSE found by each method.

Code
dse_list_uniq <- list(bona_fide_TAF2_ids, bona_fide_TAF2dIDR_ids, 
                      stat_TAF2, stat_TAF2dIDR, 
                      diff_events_TAF2, diff_events_TAF2dIDR)
 
names(dse_list_uniq) <- c('TAF2_stringent_cmpr', 'TAF2dIDR_stringent_cmpr',
                           'TAF2_tidy', 'TAF2dIDR_tidy', 
                           'TAF2_diff', 'TAF2dIDR_diff')
Code
intersection_methods_ids <- upset(fromList(dse_list_uniq), nsets = length(dse_list_uniq),
                                  text.scale = 1.5, nintersects = NA, 
                                  mb.ratio = c(0.6, 0.4), decreasing = c(T, F),
                                  order.by = 'freq', main.bar.color = "gray16") 

Show the number and overlap of events found by each method with an upset plot:

Code
intersection_methods_ids

If we instead consider the relaxed compared method coming from the super samples

Code
dse_list_mrgd <- list(cmpr_TAF2_all_ids, cmpr_TAF2dIDR_all_ids,
                 stat_TAF2, stat_TAF2dIDR, 
                 diff_events_TAF2, diff_events_TAF2dIDR)
 
names(dse_list_mrgd) <- c('TAF2_relaxed_cmpr', 'TAF2dIDR_relaxed_cmpr',
                          'TAF2_tidy', 'TAF2dIDR_tidy', 
                          'TAF2_diff', 'TAF2dIDR_diff')
Code
intersection_methods_ids_mrgd <- upset(fromList(dse_list_mrgd), nsets = length(dse_list_mrgd),
                                       text.scale = 1.5, nintersects = NA, 
                                       mb.ratio = c(0.6, 0.4), decreasing = c(T, F),
                                       order.by = 'freq', main.bar.color = "gray16") 

Show the number and overlap of events found by each method with an upset plot:

Code
intersection_methods_ids_mrgd

7 Export DSE and data

Note

Export the significant events

7.1 Exons

Recreate Upset plot sets.

Code
EX_up_common <- EX_up_TAF2[(EX_up_TAF2 %in% EX_up_TAF2dIDR)] 
EX_do_common <- EX_do_TAF2[(EX_do_TAF2 %in% EX_do_TAF2dIDR)] 

EX_up_uniqTAF2 <- EX_up_TAF2[!(EX_up_TAF2 %in% c(EX_up_common, EX_up_TAF2dIDR))]
EX_do_uniqTAF2 <- EX_do_TAF2[!(EX_do_TAF2 %in% c(EX_do_common, EX_do_TAF2dIDR, EX_up_TAF2, EX_up_TAF2dIDR))] 

EX_up_uniqTAF2dIDR <- EX_up_TAF2dIDR[!(EX_up_TAF2dIDR %in% 
                                           c(EX_up_common, EX_up_TAF2, EX_do_TAF2, EX_do_TAF2dIDR))]
EX_do_uniqTAF2dIDR <- EX_do_TAF2dIDR[!(EX_do_TAF2dIDR %in% c(EX_do_common, EX_up_TAF2, EX_up_common))] 

Export to file the differentially spliced exons IDs

Code
write_delim(x = as.data.frame(EX_up_common), col_names = F, append = F, quote = 'none', escape = 'none',
            file = file.path(dse_IDs_dir, '1_UP_EXONS_BOTH.tab'), delim = '\t')

write_delim(x = as.data.frame(EX_do_common), col_names = F, append = F, quote = 'none', escape = 'none',
            file = file.path(dse_IDs_dir, '2_DOWN_EXONS_BOTH.tab'), delim = '\t')

write_delim(x = as.data.frame(EX_up_uniqTAF2), col_names = F, append = F, quote = 'none', escape = 'none',
            file = file.path(dse_IDs_dir, '3_UP_EXONS_TAF2.tab'), delim = '\t')

write_delim(x = as.data.frame(EX_do_uniqTAF2), col_names = F, append = F, quote = 'none', escape = 'none',
            file = file.path(dse_IDs_dir, '4_DOWN_EXONS_TAF2.tab'), delim = '\t')

write_delim(x = as.data.frame(EX_up_uniqTAF2dIDR), col_names = F, append = F, quote = 'none', escape = 'none',
            file = file.path(dse_IDs_dir, '5_UP_EXONS_TAF2dIDR.tab'), delim = '\t')

write_delim(x = as.data.frame(EX_do_uniqTAF2dIDR), col_names = F, append = F, quote = 'none', escape = 'none',
            file = file.path(dse_IDs_dir, '6_DOWN_EXONS_TAF2dIDR.tab'), delim = '\t')

7.2 Introns

Recreate Upset plot sets.

Code
IN_up_common <- IN_up_TAF2[(IN_up_TAF2 %in% IN_up_TAF2dIDR)] 
IN_do_common <- IN_do_TAF2[(IN_do_TAF2 %in% IN_do_TAF2dIDR)] 

IN_up_uniqTAF2 <- IN_up_TAF2[!(IN_up_TAF2 %in% c(IN_up_common, IN_up_TAF2dIDR))]
IN_do_uniqTAF2 <- IN_do_TAF2[!(IN_do_TAF2 %in% c(IN_do_common, IN_do_TAF2dIDR, IN_up_TAF2, IN_up_TAF2dIDR))] 

IN_up_uniqTAF2dIDR <- IN_up_TAF2dIDR[!(IN_up_TAF2dIDR %in% c(IN_up_common, IN_up_TAF2, IN_do_TAF2, IN_do_TAF2dIDR))]
IN_do_uniqTAF2dIDR <- IN_do_TAF2dIDR[!(IN_do_TAF2dIDR %in% c(IN_do_common, IN_up_TAF2, IN_up_common))] 

Export to file the differentially spliced introns IDs

Code
write_delim(x = as.data.frame(IN_up_common), col_names = F, append = F, quote = 'none', escape = 'none',
            file = file.path(dse_IDs_dir, '1_UP_INTRONS_BOTH.tab'), delim = '\t')

write_delim(x = as.data.frame(IN_do_common), col_names = F, append = F, quote = 'none', escape = 'none',
            file = file.path(dse_IDs_dir, '2_DOWN_INTRONS_BOTH.tab'), delim = '\t')

write_delim(x = as.data.frame(IN_up_uniqTAF2), col_names = F, append = F, quote = 'none', escape = 'none',
            file = file.path(dse_IDs_dir, '3_UP_INTRONS_TAF2.tab'), delim = '\t')

write_delim(x = as.data.frame(IN_do_uniqTAF2), col_names = F, append = F, quote = 'none', escape = 'none',
            file = file.path(dse_IDs_dir, '4_DOWN_INTRONS_TAF2.tab'), delim = '\t')

write_delim(x = as.data.frame(IN_up_uniqTAF2dIDR), col_names = F, append = F, quote = 'none', escape = 'none',
            file = file.path(dse_IDs_dir, '5_UP_INTRONS_TAF2dIDR.tab'), delim = '\t')

write_delim(x = as.data.frame(IN_do_uniqTAF2dIDR), col_names = F, append = F, quote = 'none', escape = 'none',
            file = file.path(dse_IDs_dir, '6_DOWN_INTRONS_TAF2dIDR.tab'), delim = '\t')

7.3 Alternative 3’ and 5’ splice sites

Code
A3_common <- A3_TAF2[(A3_TAF2 %in% A3_TAF2dIDR)] 
A5_common <- A5_TAF2[(A5_TAF2 %in% A5_TAF2dIDR)] 

A3_uniqTAF2 <- A3_TAF2[!A3_TAF2 %in% A3_TAF2dIDR]
stopifnot(length(A3_uniqTAF2) + length(A3_common) == length(A3_TAF2))

A3_uniqTAF2dIDR <- A3_TAF2dIDR[!(A3_TAF2dIDR %in% A3_TAF2)]
stopifnot(length(A3_uniqTAF2dIDR) + length(A3_common) == length(A3_TAF2dIDR))

A5_uniqTAF2 <- A5_TAF2[!A5_TAF2 %in% A5_TAF2dIDR]
stopifnot(length(A5_uniqTAF2) + length(A5_common) == length(A5_TAF2))

A5_uniqTAF2dIDR <- A5_TAF2dIDR[!(A5_TAF2dIDR %in% A5_TAF2)]
stopifnot(length(A5_uniqTAF2dIDR) + length(A5_common) == length(A5_TAF2dIDR))

Export to file the differentially spliced splice sites IDs

Code
write_delim(x = as.data.frame(A3_common), col_names = F, append = F, quote = 'none', escape = 'none',
            file = file.path(dse_IDs_dir, '1_ALT_ACCEPTOR_SPLICE_SITES_BOTH.tab'), delim = '\t')

write_delim(x = as.data.frame(A5_common), col_names = F, append = F, quote = 'none', escape = 'none',
            file = file.path(dse_IDs_dir, '2_ALT_DONOR_SPLICE_SITES_BOTH.tab'), delim = '\t')

write_delim(x = as.data.frame(A3_uniqTAF2), col_names = F, append = F, quote = 'none', escape = 'none',
            file = file.path(dse_IDs_dir, '3_ALT_ACCEPTOR_SPLICE_SITES_TAF2.tab'), delim = '\t')

write_delim(x = as.data.frame(A3_uniqTAF2dIDR), col_names = F, append = F, quote = 'none', escape = 'none',
            file = file.path(dse_IDs_dir, '4_ALT_ACCEPTOR_SPLICE_SITES_TAF2dIDR.tab'), delim = '\t')

write_delim(x = as.data.frame(A5_uniqTAF2), col_names = F, append = F, quote = 'none', escape = 'none',
            file = file.path(dse_IDs_dir, '5_ALT_DONOR_SPLICE_SITES_TAF2.tab'), delim = '\t')

write_delim(x = as.data.frame(A5_uniqTAF2dIDR), col_names = F, append = F, quote = 'none', escape = 'none',
            file = file.path(dse_IDs_dir, '6_ALT_DONOR_SPLICE_SITES_TAF2dIDR.tab'), delim = '\t')

7.4 Combine all into one dataframe

Code
rbind(
  data.frame(EXPERIMENT = 'TAF2',
             TYPE = "Exon",
             EVENT = c(EX_up_common, EX_up_uniqTAF2),
             DIRECTION = 'UP'),
  
  data.frame(EXPERIMENT = 'TAF2∆IDR',
             TYPE = "Exon",
             EVENT = c(EX_up_common, EX_up_uniqTAF2dIDR),
             DIRECTION = 'UP'),
  
  data.frame(EXPERIMENT = 'TAF2',
             TYPE = "Exon",
             EVENT = c(EX_do_common, EX_do_uniqTAF2),
             DIRECTION = 'DOWN'),
  
  data.frame(EXPERIMENT = 'TAF2∆IDR',
             TYPE = "Exon",
             EVENT = c(EX_do_common, EX_do_uniqTAF2dIDR),
             DIRECTION = 'DOWN')
) -> exons_IDs
Code
rbind(
  data.frame(EXPERIMENT = 'TAF2',
             TYPE = "Intron",
             EVENT = c(IN_up_common, IN_up_uniqTAF2),
             DIRECTION = 'UP'),
  
  data.frame(EXPERIMENT = 'TAF2∆IDR',
             TYPE = "Intron",
             EVENT = c(IN_up_common, IN_up_uniqTAF2dIDR),
             DIRECTION = 'UP'),
  
  data.frame(EXPERIMENT = 'TAF2',
             TYPE = "Intron",
             EVENT = c(IN_do_common, IN_do_uniqTAF2),
             DIRECTION = 'DOWN'),
  
  data.frame(EXPERIMENT = 'TAF2∆IDR',
             TYPE = "Intron",
             EVENT = c(IN_do_common, IN_do_uniqTAF2dIDR),
             DIRECTION = 'DOWN')
) -> introns_IDs
Code
rbind(
  data.frame(EXPERIMENT = 'TAF2',
             TYPE = "Alt. 3' ss",
             EVENT = c(A3_common, A3_uniqTAF2)),
  
  data.frame(EXPERIMENT = 'TAF2∆IDR',
             TYPE = "Alt. 3' ss",
             EVENT = c(A3_common, A3_uniqTAF2dIDR)),
  
  data.frame(EXPERIMENT = 'TAF2',
             TYPE = "Alt. 5' ss",
             EVENT = c(A5_common, A5_uniqTAF2)),
  
  data.frame(EXPERIMENT = 'TAF2∆IDR',
             TYPE = "Alt. 5' ss",
             EVENT = c(A5_common, A5_uniqTAF2dIDR))
) -> alt_splice_sites_IDs

alt_splice_sites_IDs$DIRECTION = 'BOTH'

Save to file shared and unique events that are differentially spliced.

Code
shared_and_uniq_AS_events <- rbind(exons_IDs, introns_IDs, alt_splice_sites_IDs)

write_delim(x = shared_and_uniq_AS_events, col_names = T, append = F, quote = 'none', escape = 'none',
            file = file.path(dse_IDs_dir, '0_ALL_SHARED_UNIQUE_ALT_SPLICED_EVENTS.tab'), delim = '\t')

7.5 Annotate the differentially spliced events

Code
SHR_UNI_AS_events <- read_delim(file = file.path(dse_IDs_dir, '0_ALL_SHARED_UNIQUE_ALT_SPLICED_EVENTS.tab'),
                                delim = '\t', quote = 'none', 
                                col_names = T, show_col_types = FALSE)

Annotate these events by fetching the PSI and coordinates from the main inclusion table.

Code
fetch_diff_IDs <- unique(SHR_UNI_AS_events$EVENT)
message(length(fetch_diff_IDs), ' events to get info... this may take some time')
1830 events to get info... this may take some time

This next step might take some time to process.

Code
grep_psi(inclusion_tbl = psi_path, vst_id = fetch_diff_IDs) |>
  tidy_vst_psi() -> shr_uni_psi_df

Merge all the differentially spliced events with the annotation data.

Code
DSE_tbl <- right_join(x = SHR_UNI_AS_events, y = shr_uni_psi_df, by = join_by(EVENT), multiple = "all")
Warning in right_join(x = SHR_UNI_AS_events, y = shr_uni_psi_df, by = join_by(EVENT), : Detected an unexpected many-to-many relationship between `x` and `y`.
ℹ Row 1 of `x` matches multiple rows in `y`.
ℹ Row 901 of `y` matches multiple rows in `x`.
ℹ If a many-to-many relationship is expected, set `relationship =
  "many-to-many"` to silence this warning.

Export the annotated differentially spliced events.

Code
stopifnot( length(unique(SHR_UNI_AS_events$EVENT)) == length(unique(DSE_tbl$EVENT)) )

write_delim(x = DSE_tbl, col_names = T, append = F, quote = 'none', escape = 'none',
            file = file.path(dse_IDs_dir, '0_ALL_SHARED_UNIQUE_ALT_SPLICED_EVENTS_PSI.tab'),
            delim = '\t')

Import this table instead of processing it every time.

Code
DSE_tbl <- read_delim(file = file.path(dse_IDs_dir, '0_ALL_SHARED_UNIQUE_ALT_SPLICED_EVENTS_PSI.tab'),
                                delim = '\t', quote = 'none', 
                                col_names = T, show_col_types = FALSE)

Define samples and supersample groups

Code
mrgd_cntrl <- "HeLa_FRT"
mrgd_TAF2OE <- "HeLa_TAF2"
mrgd_NLSTAF2dIDROE <- "HeLa_NLSTAF2dIDR" 

taf2oe <- c('HeLa_TAF2_A', 'HeLa_TAF2_B', 'HeLa_TAF2_C')
taf2didr <- c('HeLa_NLSTAF2dIDR_A', 'HeLa_NLSTAF2dIDR_B', 'HeLa_NLSTAF2dIDR_C')
cntrls <- c('HeLa_FRT_A', 'HeLa_FRT_B', 'HeLa_FRT_C')

Calculate ∆PSIs for TAF dataset.

Code
dPSI_thshld <- 15
DSE_tbl |>
  group_by(EVENT) |>
  summarise(mPSI_TAF2 = mean(PSI[Sample %in% mrgd_TAF2OE], na.rm = T),
            mPSI_TAF2dIDR = mean(PSI[Sample %in% mrgd_NLSTAF2dIDROE], na.rm = T),
            mPSI_CNTRL = mean(PSI[Sample %in% mrgd_cntrl], na.rm = T) ) |>
  mutate(dPSI_TAF2 = mPSI_TAF2 - mPSI_CNTRL,
         dPSI_TAF2dIDR = mPSI_TAF2dIDR - mPSI_CNTRL ) |>
  subset( is.finite(dPSI_TAF2) & is.finite(dPSI_TAF2dIDR ) ) |>
  select(EVENT, dPSI_TAF2, dPSI_TAF2dIDR) |>
  unique() |>
  left_join(y = DSE_tbl, by = join_by(EVENT)) |>
  relocate(GENE, .before = EVENT) |>
  relocate(starts_with('dPSI'), .after = PSI) |>
  select(GENE, EVENT, starts_with('dPSI')) |>
  unique() |>
  mutate(AS_TYPE = case_when( grepl(pattern = "^HsaALTA", x = EVENT) ~ 'Alt. Acceptor',
                              grepl(pattern = "^HsaALTD", x = EVENT) ~ 'Alt. Donor',
                              grepl(pattern = "^HsaINT", x = EVENT) ~ 'Intron',
                              grepl(pattern = "^HsaEX", x = EVENT) ~ 'Exon') ) |>
  relocate(AS_TYPE, .after = EVENT) |>
  mutate(direction_TAF2 = case_when( dPSI_TAF2 >= dPSI_thshld ~ 'UP',
                                     dPSI_TAF2 <= -dPSI_thshld ~ 'DOWN',
                                     between(dPSI_TAF2, left = -dPSI_thshld, right = dPSI_thshld) ~ 'NONE') ) |>
  mutate(direction_TAF2 = factor(direction_TAF2, levels = c('UP', 'DOWN', 'NONE'))) |>
  mutate(direction_TAF2dIDR = case_when( dPSI_TAF2dIDR >= dPSI_thshld ~ 'UP',
                                        dPSI_TAF2dIDR <= -dPSI_thshld ~ 'DOWN',
                                        between(dPSI_TAF2dIDR, left = -dPSI_thshld, right = dPSI_thshld) ~ 'NONE') ) |>
  mutate(direction_TAF2 = factor(direction_TAF2, levels = c('UP', 'DOWN', 'NONE'))) |>
  arrange(direction_TAF2) -> tidy_dPSI_DSE

Note: these ∆PSIs are are calculated from the mean ∆PSI, similarly to how vast-tools compare would calcualte them, so the events that are found differentially spliced with the diff method are labelled as “NONE” because they are below 15 PSI.

Code
head(subset(tidy_dPSI_DSE, direction_TAF2 == "NONE" & direction_TAF2dIDR == "NONE"))
# A tibble: 6 × 7
  GENE   EVENT AS_TYPE dPSI_TAF2 dPSI_TAF2dIDR direction_TAF2 direction_TAF2dIDR
  <chr>  <chr> <chr>       <dbl>         <dbl> <fct>          <chr>             
1 BNC2   HsaA… Alt. A…   -11.7          -11.7  NONE           NONE              
2 CCDC24 HsaA… Alt. A…   -11.0           -6.12 NONE           NONE              
3 GPSM1  HsaA… Alt. A…     0.240          5.49 NONE           NONE              
4 PAN2   HsaA… Alt. A…    12.1           13.5  NONE           NONE              
5 PHLDB1 HsaA… Alt. A…    12.0            9.26 NONE           NONE              
6 REPS1  HsaA… Alt. A…   -13.0          -10.4  NONE           NONE              

Export the annotated differentially spliced events.

Code
stopifnot( length(unique(SHR_UNI_AS_events$EVENT)) == length(unique(tidy_dPSI_DSE$EVENT)) )

write_delim(x = tidy_dPSI_DSE, col_names = T, append = F, quote = 'none', escape = 'none',
            file = file.path(dse_IDs_dir, '0_ALL_SHARED_UNIQUE_ALT_SPLICED_EVENTS_dPSI.tab'),
            delim = '\t')

Generate a supplementary table

Code
Supp_Tbl_TAF <- 
    left_join(DSE_tbl, tidy_dPSI_DSE, by = join_by(EVENT, GENE) ) |>
    select(!c(TYPE, Quality_Score_Type, direction_TAF2, direction_TAF2dIDR)) |>
    relocate(EVENT, .after = GENE) |>
    relocate(AS_TYPE, .after = COMPLEX) |>
    relocate(DIRECTION, .after = dPSI_TAF2dIDR) |>
    unique() |>
    arrange(desc(dPSI_TAF2)) |>
    print(n = 1)
# A tibble: 25,620 × 14
  EXPERIMENT GENE   EVENT       COORD LENGTH FullCO COMPLEX AS_TYPE Sample   PSI
  <chr>      <chr>  <chr>       <chr>  <dbl> <chr>  <chr>   <chr>   <chr>  <dbl>
1 TAF2∆IDR   ALG1L2 HsaALTD104… chr3…      4 chr3:… Alt5    Alt. D… HeLa_…     0
# ℹ 25,619 more rows
# ℹ 4 more variables: Quality_Score_Value <chr>, dPSI_TAF2 <dbl>,
#   dPSI_TAF2dIDR <dbl>, DIRECTION <chr>

The direction column indicates which event are up or down regulated in alone or are both up&down regulated if they are alternative splice sites.

Code
Supp_Tbl_TAF |>
    select(EVENT, DIRECTION) |>
    unique() |>
    pull(DIRECTION) |>
    table()

BOTH DOWN   UP 
 597  630  603 

Write to file as a tab delimited file.

Code
write_delim(x = Supp_Tbl_TAF, col_names = T, append = F, quote = 'none', escape = 'none',
            file = file.path(dse_IDs_dir, '0_HeLa_TAF2_Supplemtentary_Table_DSE_SHARED_UNIQUE.tab'),
            delim = '\t')

8 Conclusion

Note

TAF2 OE results in AS changes.

In summary relative to the control sample the 3 different methods:

  • compare picks up AS events with big ∆PSI differences but could be over stringent. To avoid having unbiased analysis towards highly expressed genes, a super sample is created merging the 3 replicates of each condition into one.

  • tidy picks up AS events with small ∆PSI differences but good agreement between replicates.

  • diff picks up AS events with ∆PSI trends between groups and is less stringent on the accordance between replicates.

So all in all it kind of makes sense that the overlap between the 3 methods is very small.

In total TAF2 and NLS-∆IDR-TAF2 over-expression result in HsaEX0043424, HsaEX0040571, HsaEX0049952, HsaEX1030778, HsaEX0035017, HsaEX1020927, HsaEX0035530, HsaEX0074411, HsaEX0055118, HsaEX0014404, HsaEX0073100, HsaEX0036592, HsaEX0028144, HsaEX0025866, HsaEX0054866, HsaEX0032957, HsaEX0017953, HsaEX0006128, HsaEX1031255, HsaEX0029023, HsaEX0074239, HsaEX1029362, HsaEX0036495, HsaEX0073026, HsaEX0027842, HsaEX0073353, HsaEX0041994, HsaEX0049579, HsaEX0052293, HsaEX0048999, HsaEX0028262, HsaEX0031511, HsaEX0032262, HsaEX0061835, HsaEX1004490, HsaEX1018259, HsaEX0016877, HsaEX0013859, HsaEX0031113, HsaEX0019006, HsaEX0041993, HsaEX6060950, HsaEX0010064, HsaEX6006284, HsaEX0045458, HsaEX7109033, HsaEX0072191, HsaEX0036546, HsaEX0057591, HsaEX0024980, HsaEX0049580, HsaEX0022375, HsaEX0055282, HsaEX6086001, HsaEX0048992, HsaEX0016880, HsaEX0007144, HsaEX0027005, HsaEX0065630, HsaEX7006688, HsaEX0067729, HsaEX0056740, HsaEX0073392, HsaEX0055157, HsaEX6060949, HsaEX0037346, HsaEX0042005, HsaEX0030376, HsaEX6060948, HsaEX0023732, HsaEX0013089, HsaEX0067855, HsaEX0031175, HsaEX1034246, HsaEX0052085, HsaEX0040037, HsaEX6061688, HsaEX1008990, HsaEX0055494, HsaEX0043767, HsaEX0061606, HsaEX0061607, HsaEX6077217, HsaEX6077218, HsaEX6077214, HsaEX6077219, HsaEX1014045, HsaEX0045761, HsaEX0041743, HsaEX0074358, HsaEX0020727, HsaEX0041755, HsaEX1000645, HsaEX0004471, HsaEX0073977, HsaEX1031199, HsaEX6033327, HsaEX0011042, HsaEX0011453, HsaEX0061834, HsaEX0004243, HsaEX0073931, HsaEX6057871, HsaEX0011560, HsaEX0053013, HsaEX0036739, HsaEX0005922, HsaEX0040570, HsaEX0011460, HsaEX0025339, HsaEX0024644, HsaEX1035776, HsaEX0052871, HsaEX1045920, HsaEX0060986, HsaEX0057235, HsaEX0009594, HsaEX0071046, HsaEX0001744, HsaEX0030533, HsaEX0064579, HsaEX0057951, HsaEX0017957, HsaEX0062596, HsaEX0062508, HsaEX7109024, HsaEX0036591, HsaEX1014199, HsaEX0041427, HsaEX0019116, HsaEX0039874, HsaEX0026476, HsaEX7002603, HsaEX0012408, HsaEX0012454, HsaEX0043470, HsaEX0065160, HsaEX1027135, HsaEX1011825, HsaEX0010853, HsaEX0010746, HsaEX0030916, HsaEX0026806, HsaEX0073842, HsaEX1026425, HsaEX0001613, HsaEX0062017, HsaEX0047946, HsaEX0025442, HsaEX0074565, HsaEX0067368, HsaEX0063284, HsaEX1042577, HsaEX0039927, HsaEX1041797, HsaEX0040567, HsaEX6075925, HsaEX0002277, HsaEX0006440, HsaEX0016502, HsaEX0021625, HsaEX1006901, HsaEX0005500, HsaEX0018836, HsaEX0005903, HsaEX0052618, HsaEX0057789, HsaEX0019522, HsaEX0069288, HsaEX0059758, HsaEX0031916, HsaEX0036540, HsaEX0001737, HsaEX0023945, HsaEX0019468, HsaEX0007140, HsaEX0048998, HsaEX0073002, HsaEX0057234, HsaEX0011649, HsaEX0028782, HsaEX6083095, HsaEX0007738, HsaEX0007739, HsaEX0069548, HsaEX0010747, HsaEX0074409, HsaEX0037862, HsaEX0009678, HsaEX0056605, HsaEX0061697, HsaEX0066788, HsaEX7005833, HsaEX0034317, HsaEX0072136, HsaEX0054864, HsaEX0060046, HsaEX0058401, HsaEX0063438, HsaEX0073984, HsaEX0001238, HsaEX1030691, HsaEX1022230, HsaEX0046227, HsaEX0054849, HsaEX0048996, HsaEX0027788, HsaEX1015830, HsaEX0044642, HsaEX0036511, HsaEX0032370, HsaEX6097708, HsaEX0042045, HsaEX0058942, HsaEX1025726, HsaEX0046753, HsaEX1033039, HsaEX0042916, HsaEX0070368, HsaEX0019178, HsaEX1024204, HsaEX1026455, HsaEX0018121, HsaEX0013334, HsaEX0040712, HsaEX1019033, HsaEX0009604, HsaEX0005406, HsaEX0023878, HsaEX0013160, HsaEX0002674, HsaEX0005569, HsaEX0007416, HsaEX0010908, HsaEX0016470, HsaEX0018394, HsaEX0019272, HsaEX0023066, HsaEX0025875, HsaEX0027480, HsaEX0056512, HsaEX0073255, HsaEX0010540, HsaEX0033081, HsaEX1046138, HsaEX0048803, HsaEX1021140, HsaEX6074908, HsaEX1038313, HsaEX0001739, HsaEX1019505, HsaEX0050487, HsaEX0054587, HsaEX0030068, HsaEX0008238, HsaEX0055498, HsaEX0001847, HsaEX1033784, HsaEX1023726, HsaEX0012060, HsaEX0010520, HsaEX0064510, HsaEX1009005, HsaEX1000301, HsaEX0002276, HsaEX0000450, HsaEX0074261, HsaEX0056143, HsaEX0026862, HsaEX0031114, HsaEX7009598, HsaEX7109083, HsaEX0073995, HsaEX1006463, HsaEX0010057, HsaEX0050918, HsaEX0034046, HsaEX0061962, HsaEX1045166, HsaEX0048689, HsaEX0031140, HsaEX0066198, HsaEX0008021, HsaEX0053549, HsaEX1022199, HsaEX0045422, HsaEX0074238, HsaEX0074060, HsaEX0074356, HsaEX0008756, HsaEX1009710, HsaEX0036734, HsaEX0000444, HsaEX0062235, HsaEX0035360, HsaEX0064337, HsaEX0045997, HsaEX0052092, HsaEX0071180, HsaEX0053628, HsaEX0005481, HsaEX0025397, HsaEX0004775, HsaEX0032328, HsaEX1007990, HsaEX1045440, HsaEX1045318, HsaEX0028142, HsaEX0013322, HsaEX0013656, HsaEX0065285, HsaEX0016878, HsaEX1010421, HsaEX1030692, HsaEX0030562, HsaEX0040407, HsaEX0003202, HsaEX0004138, HsaEX0036735, HsaEX0019004, HsaEX0055156, HsaEX0059093, HsaEX0054743, HsaEX0064477, HsaEX0002629, HsaEX0058715, HsaEX0018189, HsaEX0060670, HsaEX0052819, HsaEX0048451, HsaEX0066609, HsaEX0049467, HsaEX0051040, HsaEX0071902, HsaEX0073229, HsaEX0032238, HsaEX0022383, HsaEX0007241, HsaEX0065399, HsaEX0013099, HsaEX0053358, HsaEX0002628, HsaEX0005673, HsaEX0060535, HsaEX0050308, HsaEX0052949, HsaEX0070719, HsaEX0060536, HsaEX0010809, HsaEX0059143, HsaEX0024656, HsaEX0054858, HsaEX0011742, HsaEX0008754, HsaEX1001262, HsaEX6077216, HsaEX0008747, HsaEX0073504, HsaEX0063700, HsaEX0016160, HsaEX6036072, HsaEX0011557, HsaEX0036048, HsaEX0071895, HsaEX0054867, HsaEX0000053, HsaEX0060838, HsaEX0011544, HsaEX0035879, HsaEX0074191, HsaEX0002617, HsaEX1025788, HsaEX0019523, HsaEX0047947, HsaEX0069903, HsaEX1026787, HsaEX0007536, HsaEX0057619, HsaEX0021429, HsaEX0045492, HsaEX0051916, HsaEX1005931, HsaEX0007298, HsaEX0039729, HsaEX0072192, HsaEX0064585, HsaEX0036718, HsaEX0025407, HsaEX1038866, HsaEX1041074, HsaEX0072799, HsaEX0029450, HsaEX0038295, HsaEX0039050, HsaEX0045664, HsaEX0058132, HsaEX0071097, HsaEX0074473, HsaEX0054588, HsaEX0004687, HsaEX1007525, HsaEX7004945, HsaEX0070880, HsaEX0036265, HsaEX0036702, HsaEX0037092, HsaEX0044699, HsaEX0048341, HsaEX0070238, HsaEX0011575, HsaEX0042315, HsaEX0005442, HsaEX0011281, HsaEX0053135, HsaEX0027174, HsaEX0035944, HsaEX6012794, HsaEX0009728, HsaEX0064796, HsaEX0038232, HsaEX0052222, HsaEX0054000, HsaEX0056963, HsaEX0065864, HsaEX1032727, HsaEX0051188, HsaEX0046002, HsaEX1010975, HsaEX0024985, HsaEX1044897, HsaEX0071045, HsaEX0035953, HsaEX1004026, HsaEX0042181, HsaEX1018510, HsaEX1007387, HsaEX0043249, HsaEX1004593, HsaEX0070371, HsaEX0013083, HsaEX1027192, HsaEX0041511, HsaEX0070923, HsaEX1013434, HsaEX0073838, HsaEX6021320, HsaEX0064798, HsaEX0055481, HsaEX1000565, HsaEX1004023, HsaEX0049799, HsaEX0055666, HsaEX0012657, HsaEX1006185, HsaEX0071195, HsaEX1037806, HsaEX6097697, HsaEX0048428, HsaEX0036595, HsaEX0070375, HsaEX1019759, HsaEX0036545, HsaEX7109121, HsaEX1037808, HsaEX0071189, HsaEX1037807, HsaEX0018161, HsaEX0004251, HsaEX0001873, HsaEX0011991, HsaEX0012655, HsaEX0036594, HsaEX6060520, HsaEX0059276, HsaEX0017703, HsaEX7009386, HsaEX1046648, HsaEX0002519, HsaEX0038985, HsaEX0014410, HsaEX0003428, HsaEX0017700, HsaEX1029767, HsaEX1021141, HsaEX0016500, HsaEX7004747, HsaEX0028596, HsaEX0071187, HsaEX6066383, HsaEX0007130, HsaEX1045734, HsaEX0072499, HsaEX1020928, HsaEX0068064, HsaEX0052258, HsaEX0009318, HsaEX0018134, HsaEX1010200, HsaEX0018967, HsaEX0058796, HsaEX0033556, HsaEX0038699, HsaEX0009694, HsaEX0027604, HsaEX6051937, HsaEX0074160, HsaEX0073849, HsaEX0004503, HsaEX0006077, HsaEX0026282, HsaEX0024433, HsaEX0027526, HsaEX0003357, HsaEX0035236, HsaEX1010903, HsaEX1001069, HsaEX0001275, HsaEX0026474, HsaEX0032108, HsaEX0002533, HsaEX0072652, HsaEX6000691, HsaEX0037333, HsaEX0072785, HsaEX0074003, HsaEX0003969, HsaEX1010435, HsaEX0009799, HsaEX0032784, HsaEX0049938, HsaEX0059444, HsaEX0072687, HsaEX0065402, HsaEX1028470, HsaEX1033570, HsaEX0007033, HsaEX0033932, HsaEX0066268, HsaEX0068503, HsaEX0020938, HsaEX0019304, HsaEX0037994, HsaEX0019382, HsaEX6075260, HsaEX0041614, HsaEX0013980, HsaEX0035941, HsaEX0007565, HsaEX0037335, HsaEX1002134, HsaEX0026989, HsaEX1035388, HsaEX0000176, HsaEX0036148, HsaEX0056809, HsaEX0032977, HsaEX0021310, HsaEX0054724, HsaEX0010019, HsaEX0025268, HsaEX0040303, HsaEX6010756, HsaEX0064494, HsaEX1035387, HsaEX0065392, HsaEX0042046, HsaEX0061779, HsaEX1044025, HsaEX0030534, HsaEX0065509, HsaEX0000449, HsaEX0059064, HsaEX0069928, HsaEX0035952, HsaEX0037585, HsaEX0046196, HsaEX0000440, HsaEX0053027, HsaEX0025403, HsaEX7108303, HsaEX0033859, HsaEX7108609, HsaEX1005453, HsaEX0029482, HsaEX1038197, HsaEX0035128, HsaEX0055470, HsaEX1011679, HsaEX0021116, HsaEX0011002, HsaEX1019267, HsaEX0002868, HsaEX0013407, HsaEX0059065, HsaEX0071444, HsaEX1029759, HsaEX1019939, HsaEX0065194, HsaEX0015181, HsaEX0012837, HsaEX0044622, HsaEX0015090, HsaEX0044934, HsaEX0032696, HsaEX0072189, HsaEX0034338, HsaEX0055221, HsaEX0071894, HsaEX0054259, HsaEX0022185, HsaEX0000441, HsaEX0038231, HsaEX0044521, HsaEX0038297, HsaEX0035173, HsaEX0065257, HsaEX0016967, HsaEX0055667, HsaEX0039613, HsaEX0073008, HsaEX0043651, HsaEX0068911, HsaEX6048862, HsaEX0031099, HsaEX0071890, HsaEX0001544, HsaEX0008018, HsaEX1000564, HsaEX0026745, HsaEX0045423, HsaEX0038956, HsaEX0025942, HsaEX6035252, HsaEX0070924, HsaEX0074562, HsaEX0019428, HsaEX0064581, HsaEX0043379, HsaEX0025503, HsaEX0071191, HsaEX0051998, HsaEX0067964, HsaEX0067683, HsaEX0037334, HsaEX6047520, HsaEX0055230, HsaEX0059479, HsaEX0020923, HsaEX0019249, HsaEX0068992, HsaEX0067965, HsaEX0052862, HsaEX0045645, HsaEX0026494, HsaEX0011737, HsaEX0054214, HsaEX0024059, HsaEX0067966, HsaEX0036593, HsaEX0065931, HsaEX0017210, HsaEX0026729, HsaEX0029228, HsaEX0034772, HsaEX0035861, HsaEX0040697, HsaEX0042062, HsaEX0059567, HsaEX0063481, HsaEX0065488, HsaEX0073812, HsaEX6072434, HsaEX0046794, HsaEX6087190, HsaEX1033044, HsaEX0062067, HsaEX0045367, HsaEX0042095, HsaEX0072523, HsaEX0073976, HsaEX0001716, HsaEX0052973, HsaEX0045424, HsaEX0054792, HsaEX0024057, HsaEX0012052, HsaEX0054708, HsaEX6031743, HsaEX0013093, HsaEX0044659, HsaEX0050336, HsaEX0043580, HsaEX0010516, HsaEX0006273, HsaEX0003056, HsaEX0023411, HsaEX0028107, HsaEX0044500, HsaEX0005301, HsaEX0053026, HsaEX0005409, HsaEX0009607, HsaEX0073045, HsaEX0016538, HsaEX0020438, HsaEX1042043, HsaEX0068128, HsaEX0016553, HsaEX0068322, HsaEX0056814, HsaEX0000276, HsaEX0007019, HsaEX0073378, HsaEX6038285, HsaEX0051051, HsaEX1002195, HsaEX0046752, HsaEX0012444, HsaEX0009838, HsaEX0004497, HsaEX0058883, HsaEX0037351, HsaEX0046781, HsaEX0070132, HsaEX0029198, HsaEX0061335, HsaEX7008746, HsaEX0048292, HsaEX0010201, HsaEX1014435, HsaEX0038779, HsaEX0029358, HsaEX0043467, HsaEX1021674, HsaEX0065179, HsaEX7007206, HsaEX0019516, HsaEX0027614, HsaEX0070125, HsaEX0057811, HsaEX0012402, HsaEX0003217, HsaEX0015533, HsaEX0011283, HsaEX0063458, HsaEX0037166, HsaEX0052326, HsaEX0045384, HsaEX0041047, HsaEX0018305, HsaEX0011328, HsaEX6028757, HsaEX6037428, HsaEX0035492, HsaEX0056402, HsaEX0032098, HsaEX0030775, HsaEX0007042, HsaEX1000106, HsaEX0029577, HsaEX0053547, HsaEX0052303, HsaEX6045912, HsaEX1012495, HsaEX0071035, HsaEX0003216, HsaEX0042234, HsaEX0067330, HsaEX0051482, HsaEX0033817, HsaEX0036405, HsaEX1039525, HsaEX0037463, HsaEX0024082, HsaEX0044164, HsaEX0007041, HsaEX0007736, HsaEX6050612, HsaEX0058959, HsaEX0059286, HsaEX0022385, HsaEX0052492, HsaEX0070795, HsaEX0033815, HsaEX0028340, HsaEX0036972, HsaEX7105289, HsaEX1016743, HsaEX1013430, HsaEX7008440, HsaEX0004680, HsaEX0032806, HsaEX0035436, HsaEX0033194, HsaEX0001567, HsaEX0074420, HsaEX0059442, HsaEX0001620, HsaEX0062594, HsaEX1001486, HsaEX1037317, HsaEX0038792, HsaEX0035149, HsaEX0072684, HsaEX0074387, HsaEX0019069, HsaEX0013079, HsaEX0030592, HsaEX0068786, HsaEX0067636, HsaEX0026477, HsaEX0067637, HsaEX6090354, HsaEX0009072, HsaEX0013126, HsaEX0024078, HsaEX0043533, HsaEX0034226, HsaEX0014558, HsaEX0007036, HsaEX0035362, HsaEX1044939, HsaEX0003897, HsaEX1033855, HsaEX0035123, HsaEX0070806, HsaEX0011996, HsaEX0068916, HsaEX0049987, HsaEX0073355, HsaEX0059101, HsaEX0044506, HsaEX0010560, HsaEX0029199, HsaEX0060936, HsaEX0051995, HsaEX0021797, HsaEX0013098, HsaEX0025989, HsaEX0000959, HsaEX0046796, HsaEX0007889, HsaEX0019086, HsaEX0023988, HsaEX0042924, HsaEX0050973, HsaEX1019178, HsaEX6048804, HsaEX1045307, HsaEX0066679, HsaEX0051812, HsaEX0013128, HsaEX6097695, HsaEX0062788, HsaEX0025698, HsaEX0025272, HsaEX0036839, HsaEX0013420, HsaEX1008204, HsaEX0003625, HsaEX0070235, HsaEX0067187, HsaEX0055322, HsaINT1010111, HsaINT0071958, HsaINT0128027, HsaINT0089912, HsaINT0133029, HsaINT0132652, HsaINT1004766, HsaINT0063446, HsaINT1024174, HsaINT0104511, HsaINT0133030, HsaINT0002805, HsaINT0025214, HsaINT0024047, HsaINT0169425, HsaINT1039753, HsaINT0041035, HsaINT0036037, HsaINT0030087, HsaINT1038302, HsaINT1020343, HsaINT0043195, HsaINT1044294, HsaINT0086232, HsaINT1019028, HsaINT0127537, HsaINT0154225, HsaINT0023977, HsaINT0062282, HsaINT0100106, HsaINT0033265, HsaINT1005723, HsaINT0148998, HsaINT0028880, HsaINT0054210, HsaINT0043016, HsaINT1024242, HsaINT0072905, HsaINT0047387, HsaINT0063980, HsaINT1010488, HsaINT0179857, HsaINT0153344, HsaINT0073995, HsaINT0098613, HsaINT1018257, HsaINT0047384, HsaINT0043466, HsaINT0055586, HsaINT0028554, HsaINT0043017, HsaINT1023423, HsaINT1047659, HsaINT1043593, HsaINT1026127, HsaINT0066168, HsaINT1041333, HsaINT0133031, HsaINT0077790, HsaINT0041658, HsaINT0096122, HsaINT0185242, HsaINT1026128, HsaINT0136834, HsaINT0097966, HsaINT0025215, HsaINT0151942, HsaINT0173301, HsaINT0186516, HsaINT0033222, HsaINT0025205, HsaINT0168029, HsaINT0016360, HsaINT1044618, HsaINT0058362, HsaINT0096069, HsaINT0135781, HsaINT0044538, HsaINT1045157, HsaINT0168908, HsaINT0029046, HsaINT1020227, HsaINT0040310, HsaINT0095601, HsaINT0082204, HsaINT1039951, HsaINT0140585, HsaINT0187531, HsaINT0007190, HsaINT0035898, HsaINT0140595, HsaINT0169426, HsaINT0148621, HsaINT0100943, HsaINT0019920, HsaINT1026218, HsaINT0102653, HsaINT0090834, HsaINT0168241, HsaINT0187288, HsaINT1048120, HsaINT1047332, HsaINT0011008, HsaINT0151884, HsaINT0129575, HsaINT0055872, HsaINT0012853, HsaINT0027711, HsaINT0123246, HsaINT0152271, HsaINT0002075, HsaINT0124384, HsaINT0120302, HsaINT0170726, HsaINT0146035, HsaINT1002655, HsaINT0188820, HsaINT0056525, HsaINT1014313, HsaINT0082205, HsaINT1001667, HsaINT0120486, HsaINT0146031, HsaINT0121197, HsaINT1004818, HsaINT0093277, HsaINT0033242, HsaINT0140583, HsaINT1027239, HsaINT0095835, HsaINT1017002, HsaINT0119174, HsaINT0131129, HsaINT0115252, HsaINT0155901, HsaINT0012810, HsaINT0156301, HsaINT1033366, HsaINT0187232, HsaINT1023208, HsaINT0170241, HsaINT0074497, HsaINT0033258, HsaINT0055582, HsaINT1029883, HsaINT0101099, HsaINT1014050, HsaINT0084321, HsaINT0153240, HsaINT0143269, HsaINT0021197, HsaINT0165695, HsaINT1007768, HsaINT1037179, HsaINT1037790, HsaINT1015420, HsaINT1019734, HsaINT0169413, HsaINT1026022, HsaINT0148918, HsaINT0022616, HsaINT1024529, HsaINT1028181, HsaINT1004379, HsaINT1037902, HsaINT0021752, HsaINT1026309, HsaINT0128108, HsaINT0185870, HsaINT1014065, HsaINT1026130, HsaINT0002093, HsaINT0051466, HsaINT0143170, HsaINT0047185, HsaINT0028230, HsaINT0154235, HsaINT0067143, HsaINT0095834, HsaINT0069485, HsaINT0058585, HsaINT0026293, HsaINT1009607, HsaINT0185580, HsaINT0130475, HsaINT0186907, HsaINT0098615, HsaINT0005481, HsaINT0104733, HsaINT0025213, HsaINT0139884, HsaINT1022780, HsaINT0092503, HsaINT1036692, HsaINT0021976, HsaINT1045884, HsaINT1040105, HsaINT0053434, HsaINT0051453, HsaINT0168030, HsaINT0111581, HsaINT0001237, HsaINT0148994, HsaINT1003646, HsaINT0093843, HsaINT0015972, HsaINT0097964, HsaINT0089283, HsaINT0002414, HsaINT0148018, HsaINT0182501, HsaINT0151012, HsaINT0126337, HsaINT0127539, HsaINT0167828, HsaINT1003169, HsaINT0086020, HsaINT0129106, HsaINT1034985, HsaINT1045031, HsaINT1027190, HsaINT0047048, HsaINT0140388, HsaINT1024910, HsaINT1042051, HsaINT1026647, HsaINT0114361, HsaINT0136676, HsaINT1017197, HsaINT0035981, HsaINT0179017, HsaINT0025140, HsaINT0001236, HsaINT0082090, HsaINT0009479, HsaINT0184011, HsaINT1031916, HsaINT0180264, HsaINT0090960, HsaINT1030406, HsaINT1047813, HsaINT0050456, HsaINT0063280, HsaINT0120120, HsaINT0098623, HsaINT0095058, HsaINT0188404, HsaINT0059829, HsaINT0129566, HsaINT0082203, HsaINT1028415, HsaINT0066592, HsaINT0161312, HsaINT0012083, HsaINT0093078, HsaINT0154682, HsaINT0141361, HsaINT1045023, HsaINT0120119, HsaINT1009710, HsaINT1043191, HsaINT0106458, HsaINT0140384, HsaINT0050569, HsaINT0031478, HsaINT0043018, HsaINT0186964, HsaINT0187200, HsaINT1048151, HsaINT1036113, HsaINT0140455, HsaINT1038627, HsaINT1015220, HsaINT1004361, HsaINT1015862, HsaINT1015827, HsaINT0043697, HsaINT0040307, HsaINT0145690, HsaINT1019656, HsaINT0096999, HsaINT0009480, HsaINT0005458, HsaINT1047294, HsaINT1014432, HsaINT0150071, HsaINT0040308, HsaINT1042005, HsaINT1002961, HsaINT0168356, HsaINT1038139, HsaINT0186975, HsaINT0002403, HsaINT1025720, HsaINT0122610, HsaINT0093652, HsaINT0169410, HsaINT0012085, HsaINT0113823, HsaINT1032082, HsaINT0002413, HsaINT1032871, HsaINT0165715, HsaINT0165390, HsaINT0089924, HsaINT0062942, HsaINT0033241, HsaINT1034857, HsaINT0106180, HsaINT1033072, HsaINT0126414, HsaINT0066599, HsaINT0018731, HsaINT0106977, HsaINT0084326, HsaINT0151034, HsaINT1015859, HsaINT0188371, HsaINT0055641, HsaINT1023418, HsaINT1046274, HsaINT0022896, HsaINT0127311, HsaINT1003529, HsaINT1014146, HsaINT1028179, HsaINT0060701, HsaINT0108295, HsaINT0040403, HsaINT0140389, HsaINT1014576, HsaINT0045495, HsaINT0070137, HsaINT0092378, HsaINT1025539, HsaINT0084318, HsaINT1005660, HsaINT1022719, HsaINT1014213, HsaINT1000099, HsaINT0120107, HsaINT0179823, HsaINT1030282, HsaINT0182021, HsaINT0167825, HsaINT0011557, HsaINT0121968, HsaINT0027762, HsaINT1023210, HsaINT0058229, HsaINT0169495, HsaINT0187647, HsaINT1020854, HsaINT0111200, HsaINT0052676, HsaINT0115916, HsaINT1030937, HsaINT1027523, HsaINT0011021, HsaINT0012974, HsaINT0112629, HsaINT0100942, HsaINT1014737, HsaINT1037477, HsaINT1034307, HsaINT1008370, HsaINT0120106, HsaINT0091740, HsaINT0181059, HsaINT0035223, HsaALTA1003439-1/2, HsaALTA0009384-4/4, HsaALTA1030104-4/4, HsaALTA0008342-2/4, HsaALTA0004816-2/2, HsaALTA1017871-1/4, HsaALTA0002547-1/2, HsaALTA0008401-2/4, HsaALTA1039457-6/6, HsaALTA1035971-1/2, HsaALTA1028145-1/3, HsaALTA1037464-2/2, HsaALTA0000997-1/4, HsaALTA1048850-1/4, HsaALTA1039223-2/2, HsaALTA1064103-1/2, HsaALTA1008883-3/3, HsaALTA1063436-4/4, HsaALTA1059899-1/2, HsaALTA1019377-1/2, HsaALTA0008835-1/2, HsaALTA0009786-2/3, HsaALTA1045152-2/4, HsaALTA1043983-4/5, HsaALTA1043532-2/2, HsaALTA0008259-2/2, HsaALTA1013474-2/3, HsaALTA1044814-1/2, HsaALTA1033153-1/2, HsaALTA1046082-2/2, HsaALTA1059332-2/2, HsaALTA1049092-1/2, HsaALTA0009865-1/2, HsaALTA1040283-3/4, HsaALTA1017473-1/2, HsaALTA1038794-1/2, HsaALTA1064094-2/2, HsaALTA1008424-2/2, HsaALTA1056958-2/3, HsaALTA1007768-1/3, HsaALTA1050687-2/2, HsaALTA1029657-4/4, HsaALTA1032776-4/4, HsaALTA1042894-1/2, HsaALTA0006885-1/2, HsaALTA0004924-2/3, HsaALTA0004766-2/2, HsaALTA1022535-2/2, HsaALTA1000595-1/2, HsaALTA0007008-1/2, HsaALTA1029469-2/2, HsaALTA1011892-2/2, HsaALTA1025027-1/2, HsaALTA0004566-1/2, HsaALTA0006527-3/3, HsaALTA1046544-1/2, HsaALTA0006052-1/2, HsaALTA1019387-1/4, HsaALTA0002748-3/6, HsaALTA1045634-2/2, HsaALTA1024719-1/3, HsaALTA1018246-3/3, HsaALTA1012126-2/2, HsaALTA1029472-1/3, HsaALTA1046919-1/2, HsaALTA0007010-1/2, HsaALTA1018633-3/3, HsaALTA0007877-2/2, HsaALTA1006050-1/2, HsaALTA1028148-1/3, HsaALTA1019597-3/3, HsaALTA0002463-2/2, HsaALTA1056505-6/7, HsaALTA0006097-2/2, HsaALTA1046463-1/3, HsaALTA0004895-2/2, HsaALTA1062122-1/2, HsaALTA1031113-1/2, HsaALTA0006185-2/2, HsaALTA1042599-2/2, HsaALTA1023284-1/3, HsaALTA1014350-3/3, HsaALTA0009735-1/2, HsaALTA0001229-4/4, HsaALTA0009734-2/4, HsaALTA1020423-1/2, HsaALTA1023933-2/2, HsaALTA1018013-3/3, HsaALTA1050488-3/3, HsaALTA1038937-1/2, HsaALTA0002585-3/3, HsaALTA1056764-2/2, HsaALTA1061892-1/3, HsaALTA1046644-1/2, HsaALTA1010940-2/2, HsaALTA1039566-1/6, HsaALTA1024479-2/3, HsaALTA1026576-1/5, HsaALTA0000878-1/2, HsaALTA1004788-1/4, HsaALTA1033826-3/3, HsaALTA1046358-1/2, HsaALTA1027517-1/2, HsaALTA0006061-1/2, HsaALTA1054159-1/2, HsaALTA0005796-2/2, HsaALTA1061450-2/3, HsaALTA0009804-1/2, HsaALTA1064390-2/3, HsaALTA1016744-2/2, HsaALTA1014084-1/2, HsaALTA1056309-3/3, HsaALTA1050928-2/5, HsaALTA1007836-1/2, HsaALTA1020650-2/3, HsaALTA0000829-1/2, HsaALTA1045061-5/5, HsaALTA1044727-3/3, HsaALTA0000701-2/4, HsaALTA1055018-1/3, HsaALTA0001079-2/3, HsaALTA0003161-1/3, HsaALTA0008956-3/3, HsaALTA1039804-1/3, HsaALTA1059128-3/4, HsaALTA1050787-1/5, HsaALTA1051084-1/2, HsaALTA0009317-2/2, HsaALTA0009047-1/2, HsaALTA1062343-3/5, HsaALTA0006977-3/8, HsaALTA1044072-2/3, HsaALTA0003036-1/2, HsaALTA1018794-2/3, HsaALTA0009748-1/2, HsaALTA1006244-2/3, HsaALTA1052242-1/2, HsaALTA0002675-2/2, HsaALTA1046587-2/2, HsaALTA1056188-1/2, HsaALTA1039617-2/2, HsaALTA1058683-1/2, HsaALTA0006933-2/2, HsaALTA1055689-1/3, HsaALTA1020849-2/2, HsaALTA0005833-2/3, HsaALTA1045412-1/2, HsaALTA1058527-2/3, HsaALTA1052530-1/2, HsaALTA1050829-2/2, HsaALTA1039896-3/4, HsaALTA0001291-2/2, HsaALTA0009537-2/3, HsaALTA1057187-2/2, HsaALTA1003339-1/2, HsaALTA0001246-2/2, HsaALTA0009853-1/2, HsaALTA1057537-2/2, HsaALTA1034474-2/2, HsaALTA1053419-1/3, HsaALTA0003348-1/3, HsaALTA1016161-1/2, HsaALTA1046607-1/4, HsaALTA0005651-2/3, HsaALTA0007083-1/4, HsaALTA0001463-1/2, HsaALTA0006256-2/2, HsaALTA1001090-3/3, HsaALTA1015934-2/4, HsaALTA0001572-1/5, HsaALTA0007106-4/5, HsaALTA0008669-2/4, HsaALTA1009900-2/2, HsaALTA1035717-1/3, HsaALTA0009196-1/5, HsaALTA1041349-2/2, HsaALTA1038020-1/2, HsaALTA1042067-1/2, HsaALTA1059152-1/2, HsaALTA0003783-3/3, HsaALTA1060192-3/3, HsaALTA1033627-1/2, HsaALTA0000958-1/2, HsaALTA1054316-2/2, HsaALTA1018794-3/3, HsaALTA1064390-1/3, HsaALTA0006506-1/2, HsaALTA1046919-2/2, HsaALTA1039223-1/2, HsaALTA1013474-1/3, HsaALTA1059899-2/2, HsaALTA1000641-2/2, HsaALTA0007098-2/2, HsaALTA0001903-2/2, HsaALTA1049411-1/2, HsaALTA0002603-1/2, HsaALTA1009709-1/2, HsaALTA1056171-2/3, HsaALTA1055400-2/2, HsaALTA1062534-1/2, HsaALTA0008712-3/4, HsaALTA1053300-4/4, HsaALTA1019519-2/2, HsaALTA0006594-2/2, HsaALTA1056443-1/5, HsaALTA1033195-2/2, HsaALTA1053237-2/2, HsaALTA1000093-2/3, HsaALTA0002326-1/3, HsaALTA1059875-1/3, HsaALTA1010737-1/4, HsaALTA1013375-1/2, HsaALTA0008909-2/2, HsaALTA0007173-2/2, HsaALTA1005619-1/5, HsaALTA0000224-1/2, HsaALTA1024397-2/3, HsaALTA1063685-2/3, HsaALTA0006966-2/2, HsaALTA0009738-1/2, HsaALTA1057590-1/2, HsaALTA1062343-1/5, HsaALTA1037461-3/3, HsaALTA1022420-3/3, HsaALTA0008546-1/3, HsaALTA1011866-1/3, HsaALTA1051662-1/3, HsaALTA1028829-1/2, HsaALTA1056636-2/3, HsaALTA0005312-2/2, HsaALTA0008185-3/4, HsaALTA0006082-2/2, HsaALTA1009572-5/6, HsaALTA0009814-2/3, HsaALTA1010753-1/2, HsaALTA0004136-3/7, HsaALTA0009229-2/2, HsaALTA1047209-1/2, HsaALTA0007770-2/2, HsaALTA1058684-1/2, HsaALTA1052587-3/3, HsaALTA1028588-1/2, HsaALTA1063478-4/4, HsaALTA0002633-2/4, HsaALTA1026461-2/2, HsaALTA1026538-1/2, HsaALTA1032779-1/3, HsaALTA1019062-2/2, HsaALTA1051726-1/3, HsaALTA0009687-1/2, HsaALTA1042819-2/4, HsaALTA1060430-2/2, HsaALTA1064070-2/2, HsaALTA1008702-1/2, HsaALTA1031656-2/3, HsaALTA0007293-2/5, HsaALTA1052760-2/2, HsaALTA0006796-2/2, HsaALTA1061582-2/2, HsaALTA1053181-2/4, HsaALTA1002049-5/6, HsaALTA1021889-4/4, HsaALTA0002940-2/4, HsaALTA0002671-3/3, HsaALTA1031112-1/2, HsaALTA0001896-4/4, HsaALTA1052974-1/2, HsaALTA1036642-2/2, HsaALTA0008777-2/4, HsaALTA0001897-4/4, HsaALTA0001692-2/3, HsaALTA0006209-1/2, HsaALTA0004397-2/4, HsaALTA1026576-2/5, HsaALTA1021446-3/3, HsaALTA0009766-2/3, HsaALTA1039408-2/4, HsaALTA1005988-2/2, HsaALTA1046513-2/3, HsaALTA1026084-2/2, HsaALTA1051546-2/2, HsaALTA1047222-1/2, HsaALTA0009463-2/2, HsaALTA1035136-1/2, HsaALTA1022743-1/2, HsaALTA0004644-2/2, HsaALTA0002217-2/2, HsaALTA0003348-2/3, HsaALTA1037004-3/3, HsaALTA1000837-2/3, HsaALTA0002748-4/6, HsaALTA0001603-1/2, HsaALTA0000150-1/2, HsaALTA1047281-2/2, HsaALTA0003124-2/2, HsaALTA1029095-1/2, HsaALTA0006733-2/3, HsaALTA0004108-1/7, HsaALTA1060909-2/2, HsaALTA1036480-4/4, HsaALTA1004123-2/4, HsaALTA1022216-1/5, HsaALTA0001466-2/3, HsaALTA1035490-1/2, HsaALTA0008955-1/2, HsaALTA1022420-2/3, HsaALTA0007434-2/2, HsaALTA0006565-1/2, HsaALTA1052389-1/2, HsaALTA1049845-1/2, HsaALTA1055714-2/2, HsaALTA1051570-3/3, HsaALTA0006611-2/2, HsaALTA1029657-3/4, HsaALTA0006682-2/2, HsaALTA1063884-2/2, HsaALTA1012545-1/3, HsaALTD1026507-1/5, HsaALTD1017380-1/2, HsaALTD1011164-2/2, HsaALTD1045575-2/4, HsaALTD0002200-1/4, HsaALTD1051745-2/2, HsaALTD1009527-2/2, HsaALTD1053720-2/2, HsaALTD0005084-2/2, HsaALTD0003241-3/3, HsaALTD1012110-1/5, HsaALTD1040891-2/3, HsaALTD0002958-4/5, HsaALTD0006433-2/3, HsaALTD0003709-1/2, HsaALTD1052486-1/2, HsaALTD1041176-3/3, HsaALTD1040028-1/5, HsaALTD0005050-2/2, HsaALTD1030129-1/2, HsaALTD1011399-2/2, HsaALTD1019642-1/2, HsaALTD1042796-1/2, HsaALTD1051196-1/2, HsaALTD0006417-3/3, HsaALTD1052389-3/3, HsaALTD1033213-1/3, HsaALTD1031616-2/2, HsaALTD1030031-2/2, HsaALTD0002259-1/2, HsaALTD1002070-2/2, HsaALTD1026142-1/4, HsaALTD1000647-2/2, HsaALTD1049857-2/3, HsaALTD1002296-3/3, HsaALTD1031138-3/4, HsaALTD1008118-3/3, HsaALTD0000198-1/2, HsaALTD1053739-4/4, HsaALTD1052643-1/3, HsaALTD1039523-1/2, HsaALTD1031898-2/3, HsaALTD1044251-1/2, HsaALTD0003442-1/4, HsaALTD0000877-3/3, HsaALTD1038699-1/2, HsaALTD0007511-2/2, HsaALTD1010477-2/2, HsaALTD0004002-1/2, HsaALTD1010195-2/6, HsaALTD1036963-2/4, HsaALTD1051250-2/3, HsaALTD1017638-1/2, HsaALTD1046568-1/2, HsaALTD0004833-1/2, HsaALTD0004595-1/2, HsaALTD1053802-3/3, HsaALTD1018332-2/3, HsaALTD0002834-1/2, HsaALTD1016927-1/3, HsaALTD0004056-1/4, HsaALTD1050678-3/5, HsaALTD0006652-1/3, HsaALTD1045753-2/3, HsaALTD1035628-1/3, HsaALTD1030596-1/2, HsaALTD1046533-3/3, HsaALTD1015612-2/2, HsaALTD0006935-2/2, HsaALTD1051885-1/3, HsaALTD0001377-1/2, HsaALTD1053224-3/4, HsaALTD1009765-2/4, HsaALTD0003127-1/2, HsaALTD1028869-2/4, HsaALTD0006091-1/2, HsaALTD1041243-1/2, HsaALTD1052377-2/2, HsaALTD1027746-1/3, HsaALTD1028287-2/4, HsaALTD1018874-3/3, HsaALTD1043771-2/2, HsaALTD0003429-2/3, HsaALTD1027661-3/8, HsaALTD1021700-2/2, HsaALTD1015564-2/3, HsaALTD1030558-3/8, HsaALTD1046751-3/4, HsaALTD1007309-2/2, HsaALTD0002279-1/2, HsaALTD1052278-1/3, HsaALTD1027427-2/3, HsaALTD1053809-1/2, HsaALTD1027159-2/2, HsaALTD0005082-3/4, HsaALTD0005270-1/3, HsaALTD1038994-3/3, HsaALTD1001094-1/2, HsaALTD1047456-5/9, HsaALTD0006499-1/2, HsaALTD1036300-1/5, HsaALTD1031094-1/4, HsaALTD1052775-1/3, HsaALTD1022260-1/2, HsaALTD0005684-2/2, HsaALTD1045817-2/2, HsaALTD1023675-1/2, HsaALTD1054368-1/5, HsaALTD0007378-3/5, HsaALTD0002718-3/5, HsaALTD1040183-1/3, HsaALTD1036249-2/4, HsaALTD1028165-2/3, HsaALTD1052012-2/11, HsaALTD0005881-1/3, HsaALTD1051541-2/2, HsaALTD1041808-1/2, HsaALTD1048800-1/2, HsaALTD0001508-1/2, HsaALTD1007842-2/2, HsaALTD0004438-2/2, HsaALTD0001770-1/2, HsaALTD1022942-2/3, HsaALTD1047805-1/2, HsaALTD0003030-1/3, HsaALTD1054037-2/2, HsaALTD1041552-1/2, HsaALTD1040232-1/3, HsaALTD0003466-2/2, HsaALTD0000667-2/2, HsaALTD1040934-1/2, HsaALTD1044991-2/2, HsaALTD0004707-4/5, HsaALTD1021372-1/3, HsaALTD1029809-2/2, HsaALTD1054282-1/2, HsaALTD1052854-2/3, HsaALTD1050354-2/2, HsaALTD1049041-1/4, HsaALTD1052990-9/9, HsaALTD0002982-2/4, HsaALTD1049107-4/4, HsaALTD0000446-1/3, HsaALTD0000502-1/2, HsaALTD1035534-3/3, HsaALTD0002120-4/4, HsaALTD0007297-2/2, HsaALTD0000301-1/2, HsaALTD0000750-1/2, HsaALTD1052271-1/3, HsaALTD1007850-2/5, HsaALTD0007378-1/5, HsaALTD1049709-2/2, HsaALTD0005854-3/3, HsaALTD0004342-2/3, HsaALTD1007651-2/2, HsaALTD0005176-2/2, HsaALTD0003444-1/3, HsaALTD1023262-3/3, HsaALTD1030559-3/3, HsaALTD1050556-1/3, HsaALTD0001377-2/2, HsaALTD1037727-3/3, HsaALTD0006433-1/3, HsaALTD0005654-2/2, HsaALTD1046533-2/3, HsaALTD1043472-3/4, HsaALTD0000877-2/3, HsaALTD1041188-1/2, HsaALTD1038960-4/4, HsaALTD1009263-5/6, HsaALTD0004316-2/5, HsaALTD1051196-2/2, HsaALTD1047216-2/2, HsaALTD0004574-2/2, HsaALTD1053437-2/2, HsaALTD1047534-2/2, HsaALTD1028191-1/2, HsaALTD0001214-1/2, HsaALTD1049233-2/2, HsaALTD1051754-3/3, HsaALTD0003683-1/2, HsaALTD1053797-2/2, HsaALTD1010157-2/3, HsaALTD1043770-1/2, HsaALTD1047134-2/2, HsaALTD0003646-3/3, HsaALTD1011478-2/2, HsaALTD0004925-2/2, HsaALTD0001912-3/6, HsaALTD1053745-2/5, HsaALTD1053712-1/2, HsaALTD1044334-1/3, HsaALTD0007394-2/3, HsaALTD1037385-1/2, HsaALTD1037205-2/2, HsaALTD0000365-3/3, HsaALTD0002935-1/2, HsaALTD1045760-2/2, HsaALTD0001069-1/3, HsaALTD0006577-1/2, HsaALTD1000723-2/2, HsaALTD1030196-1/2, HsaALTD1011949-3/4, HsaALTD1038401-2/2, HsaALTD1008017-3/3, HsaALTD1048899-1/2, HsaALTD1015126-1/2, HsaALTD0003020-1/3, HsaALTD0005542-2/2, HsaALTD0003443-1/2, HsaALTD1025236-2/3, HsaALTD0005603-3/4, HsaALTD1007193-2/3, HsaALTD1021716-4/4, HsaALTD1035856-2/2, HsaALTD1051533-3/3, HsaALTD1028826-2/2, HsaALTD1008971-1/2, HsaALTD1040457-1/2, HsaALTD1050771-2/2, HsaALTD1042116-2/4, HsaALTD1035789-1/4, HsaALTD1036404-2/2, HsaALTD0006163-2/2, HsaALTD1005874-2/3, HsaALTD1015509-1/3, HsaALTD0006449-1/2, HsaALTD1053090-3/4, HsaALTD1025657-2/2, HsaALTD1015616-1/2, HsaALTD0002232-1/3, HsaALTD1050585-3/4, HsaALTD1035332-1/2, HsaALTD0006768-2/2, HsaALTD1043944-3/4, HsaALTD0004325-1/2, HsaALTD1037876-1/4, HsaALTD1028162-2/2, HsaALTD1028260-1/2, HsaALTD1040173-1/2, HsaALTD0006054-2/2, HsaALTD1036871-2/2, HsaALTD0006737-2/2, HsaALTD0006070-2/4, HsaALTD0000167-1/2, HsaALTD0003620-1/2, HsaALTD1026905-1/2, HsaALTD1051990-1/3, HsaALTD1024062-2/2, HsaALTD1031036-1/2, HsaALTD1005714-2/3, HsaALTD1008542-6/6, HsaALTD1038347-2/2, HsaALTD1027159-1/2, HsaALTD1035902-1/2, HsaALTD0000450-1/3, HsaALTD1018329-1/3, HsaALTD1053926-2/2, HsaALTD1011066-2/2, HsaALTD0002516-1/2, HsaALTD1035288-1/2, HsaALTD1028132-1/2, HsaALTD1043518-2/3, HsaALTD0007211-2/2, HsaALTD1053824-4/4, HsaALTD1047052-4/4, HsaALTD0006456-2/3, HsaALTD1036214-2/2, HsaALTD1038505-2/2, HsaALTD1033173-3/3, HsaALTD1017266-2/2, HsaALTD0007429-1/2, HsaALTD0005283-3/4, HsaALTD0006833-3/3, HsaALTD1051457-3/3, HsaALTD0000167-2/2, HsaALTD1021047-2/2, HsaALTD1033874-3/4, HsaALTD1042899-2/2 differentially spliced events, that are either shared between the 2 experiments or unique to each over-expression.

9 Session Info

Code
sessioninfo::session_info()
─ Session info ───────────────────────────────────────────────────────────────
 setting  value
 version  R version 4.4.2 (2024-10-31)
 os       AlmaLinux 9.5 (Teal Serval)
 system   x86_64, linux-gnu
 ui       X11
 language en_US.UTF-8
 collate  en_US.UTF-8
 ctype    en_US.UTF-8
 tz       Europe/Madrid
 date     2025-03-03
 pandoc   3.4 @ /users/mirimia/narecco/software/miniconda/envs/analysis/bin/ (via rmarkdown)
 quarto   1.6.41 @ /users/mirimia/narecco/software/miniconda/envs/analysis/bin/quarto

─ Packages ───────────────────────────────────────────────────────────────────
 ! package              * version  date (UTC) lib source
   abind                  1.4-5    2016-07-21 [1] CRAN (R 4.4.1)
   ade4                   1.7-23   2025-02-14 [1] CRAN (R 4.4.2)
   AnnotationDbi          1.68.0   2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
   backports              1.5.0    2024-05-23 [1] CRAN (R 4.4.1)
   Biobase                2.66.0   2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
   BiocFileCache          2.14.0   2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
   BiocGenerics           0.52.0   2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
   BiocParallel           1.40.0   2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
   biomaRt                2.62.0   2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
   Biostrings             2.74.0   2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
   bit                    4.5.0.1  2024-12-03 [1] CRAN (R 4.4.2)
   bit64                  4.5.2    2024-09-22 [1] CRAN (R 4.4.1)
   bitops                 1.0-9    2024-10-03 [1] CRAN (R 4.4.1)
   blob                   1.2.4    2023-03-17 [1] CRAN (R 4.4.1)
   broom                  1.0.7    2024-09-26 [1] CRAN (R 4.4.1)
   cachem                 1.1.0    2024-05-16 [1] CRAN (R 4.4.1)
   Cairo                * 1.6-2    2023-11-28 [1] CRAN (R 4.4.1)
   car                    3.1-3    2024-09-27 [1] CRAN (R 4.4.1)
   carData                3.0-5    2022-01-06 [1] CRAN (R 4.4.1)
   cli                    3.6.4    2025-02-13 [1] CRAN (R 4.4.2)
   codetools              0.2-20   2024-03-31 [1] CRAN (R 4.4.1)
   colorspace             2.1-1    2024-07-26 [1] CRAN (R 4.4.1)
   crayon               * 1.5.3    2024-06-20 [1] CRAN (R 4.4.1)
   csaw                   1.40.0   2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
   curl                   6.0.1    2024-11-14 [1] CRAN (R 4.4.2)
   DBI                    1.2.3    2024-06-02 [1] CRAN (R 4.4.1)
   dbplyr                 2.5.0    2024-03-19 [1] CRAN (R 4.4.1)
   DelayedArray           0.32.0   2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
   desc                   1.4.3    2023-12-10 [1] CRAN (R 4.4.1)
   DESeq2                 1.46.0   2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
   devtools               2.4.5    2022-10-11 [1] CRAN (R 4.4.1)
   digest                 0.6.37   2024-08-19 [1] CRAN (R 4.4.1)
   dplyr                  1.1.4    2023-11-17 [1] CRAN (R 4.4.1)
   edgeR                  4.4.0    2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
   ellipsis               0.3.2    2021-04-29 [1] CRAN (R 4.4.1)
   evaluate               1.0.3    2025-01-10 [1] CRAN (R 4.4.2)
   farver                 2.1.2    2024-05-13 [1] CRAN (R 4.4.1)
   fastmap                1.2.0    2024-05-15 [1] CRAN (R 4.4.1)
   filelock               1.0.3    2023-12-11 [1] CRAN (R 4.4.1)
   forcats                1.0.0    2023-01-29 [1] CRAN (R 4.4.1)
   foreign                0.8-88   2025-01-12 [1] CRAN (R 4.4.2)
   Formula                1.2-5    2023-02-24 [1] CRAN (R 4.4.1)
   fs                     1.6.5    2024-10-30 [1] CRAN (R 4.4.1)
   generics               0.1.3    2022-07-05 [1] CRAN (R 4.4.1)
   GenomeInfoDb           1.42.0   2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
   GenomeInfoDbData       1.2.13   2025-03-03 [1] Bioconductor
   GenomicRanges          1.58.0   2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
   ggalluvial             0.12.5   2023-02-22 [1] CRAN (R 4.4.1)
   ggfittext              0.10.2   2024-02-01 [1] CRAN (R 4.4.1)
   ggforce              * 0.4.2    2024-02-19 [1] CRAN (R 4.4.1)
   ggplot2              * 3.5.1    2024-04-23 [1] CRAN (R 4.4.1)
   ggrepel              * 0.9.6    2024-09-07 [1] CRAN (R 4.4.1)
   ggseqlogo              0.2      2024-02-09 [1] CRAN (R 4.4.1)
   glue                   1.8.0    2024-09-30 [1] CRAN (R 4.4.1)
   gridExtra              2.3      2017-09-09 [1] CRAN (R 4.4.1)
   gtable                 0.3.6    2024-10-25 [1] CRAN (R 4.4.1)
   gtools                 3.9.5    2023-11-20 [1] CRAN (R 4.4.1)
   hms                    1.1.3    2023-03-21 [1] CRAN (R 4.4.1)
   htmltools              0.5.8.1  2024-04-04 [1] CRAN (R 4.4.1)
   htmlwidgets            1.6.4    2023-12-06 [1] CRAN (R 4.4.1)
   httpuv                 1.6.15   2024-03-26 [1] CRAN (R 4.4.1)
   httr                   1.4.7    2023-08-15 [1] CRAN (R 4.4.1)
   httr2                  1.1.0    2025-01-18 [1] CRAN (R 4.4.2)
   IRanges                2.40.0   2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
   jsonlite               1.9.0    2025-02-19 [1] CRAN (R 4.4.2)
   KEGGREST               1.46.0   2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
   knitr                  1.49     2024-11-08 [1] CRAN (R 4.4.1)
   labeling               0.4.3    2023-08-29 [1] CRAN (R 4.4.1)
   later                  1.4.1    2024-11-27 [1] CRAN (R 4.4.2)
   lattice                0.22-6   2024-03-20 [1] CRAN (R 4.4.1)
   lifecycle              1.0.4    2023-11-07 [1] CRAN (R 4.4.1)
   limma                  3.62.1   2024-11-03 [1] Bioconductor 3.20 (R 4.4.2)
   locfit                 1.5-9.11 2025-02-03 [1] CRAN (R 4.4.2)
   magrittr               2.0.3    2022-03-30 [1] CRAN (R 4.4.1)
   MASS                   7.3-64   2025-01-04 [1] CRAN (R 4.4.2)
   Matrix                 1.7-2    2025-01-23 [1] CRAN (R 4.4.2)
   MatrixGenerics         1.18.0   2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
   matrixStats            1.5.0    2025-01-07 [1] CRAN (R 4.4.2)
   memoise                2.0.1    2021-11-26 [1] CRAN (R 4.4.1)
   metapod                1.14.0   2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
   MetBrewer              0.2.0    2022-03-21 [1] CRAN (R 4.4.1)
   mime                   0.12     2021-09-28 [1] CRAN (R 4.4.1)
   miniUI                 0.1.1.1  2018-05-18 [1] CRAN (R 4.4.1)
   msa                    1.38.0   2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
   munsell                0.5.1    2024-04-01 [1] CRAN (R 4.4.1)
 R niar                 * 0.3.0    <NA>       [?] <NA>
   patchwork              1.3.0    2024-09-16 [1] CRAN (R 4.4.1)
   pheatmap             * 1.0.12   2019-01-04 [1] CRAN (R 4.4.1)
   pillar                 1.10.1   2025-01-07 [1] CRAN (R 4.4.2)
   pkgbuild               1.4.6    2025-01-16 [1] CRAN (R 4.4.2)
   pkgconfig              2.0.3    2019-09-22 [1] CRAN (R 4.4.1)
   pkgload                1.4.0    2024-06-28 [1] CRAN (R 4.4.1)
   plyr                   1.8.9    2023-10-02 [1] CRAN (R 4.4.1)
   png                    0.1-8    2022-11-29 [1] CRAN (R 4.4.1)
   polyclip               1.10-7   2024-07-23 [1] CRAN (R 4.4.1)
   prettyunits            1.2.0    2023-09-24 [1] CRAN (R 4.4.1)
   profvis                0.4.0    2024-09-20 [1] CRAN (R 4.4.1)
   progress               1.2.3    2023-12-06 [1] CRAN (R 4.4.1)
   promises               1.3.2    2024-11-28 [1] CRAN (R 4.4.2)
   psychTools             2.4.2    2024-01-18 [1] CRAN (R 4.4.1)
   purrr                  1.0.4    2025-02-05 [1] CRAN (R 4.4.2)
   R6                     2.6.1    2025-02-15 [1] CRAN (R 4.4.2)
   rappdirs               0.3.3    2021-01-31 [1] CRAN (R 4.4.1)
   RColorBrewer           1.1-3    2022-04-03 [1] CRAN (R 4.4.1)
   Rcpp                   1.0.14   2025-01-12 [1] CRAN (R 4.4.2)
   readr                  2.1.5    2024-01-10 [1] CRAN (R 4.4.1)
   remotes                2.5.0    2024-03-17 [1] CRAN (R 4.4.1)
   rlang                  1.1.5    2025-01-17 [1] CRAN (R 4.4.2)
   rmarkdown              2.29     2024-11-04 [1] CRAN (R 4.4.1)
   rprojroot              2.0.4    2023-11-05 [1] CRAN (R 4.4.1)
   Rsamtools              2.22.0   2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
   RSQLite                2.3.9    2024-12-03 [1] CRAN (R 4.4.2)
   rstatix              * 0.7.2    2023-02-01 [1] CRAN (R 4.4.1)
   rstudioapi             0.17.1   2024-10-22 [1] CRAN (R 4.4.1)
   S4Arrays               1.6.0    2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
   S4Vectors              0.44.0   2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
   scales                 1.3.0    2023-11-28 [1] CRAN (R 4.4.1)
   seqinr                 4.2-36   2023-12-08 [1] CRAN (R 4.4.1)
   sessioninfo            1.2.3    2025-02-05 [1] CRAN (R 4.4.2)
   shiny                  1.10.0   2024-12-14 [1] CRAN (R 4.4.2)
   SparseArray            1.6.0    2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
   statmod                1.5.0    2023-01-06 [1] CRAN (R 4.4.1)
   stringi                1.8.4    2024-05-06 [1] CRAN (R 4.4.1)
   stringr                1.5.1    2023-11-14 [1] CRAN (R 4.4.1)
   SummarizedExperiment   1.36.0   2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
   tibble                 3.2.1    2023-03-20 [1] CRAN (R 4.4.1)
   tidyr                  1.3.1    2024-01-24 [1] CRAN (R 4.4.1)
   tidyselect             1.2.1    2024-03-11 [1] CRAN (R 4.4.1)
   tweenr                 2.0.3    2024-02-26 [1] CRAN (R 4.4.1)
   tzdb                   0.4.0    2023-05-12 [1] CRAN (R 4.4.1)
   UCSC.utils             1.2.0    2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
   UpSetR               * 1.4.0    2019-05-22 [1] CRAN (R 4.4.1)
   urlchecker             1.0.1    2021-11-30 [1] CRAN (R 4.4.1)
   usethis                3.1.0    2024-11-26 [1] CRAN (R 4.4.2)
   utf8                   1.2.4    2023-10-22 [1] CRAN (R 4.4.1)
   vctrs                  0.6.5    2023-12-01 [1] CRAN (R 4.4.1)
   viridis              * 0.6.5    2024-01-29 [1] CRAN (R 4.4.1)
   viridisLite          * 0.4.2    2023-05-02 [1] CRAN (R 4.4.1)
   vroom                  1.6.5    2023-12-05 [1] CRAN (R 4.4.1)
   withr                  3.0.2    2024-10-28 [1] CRAN (R 4.4.1)
   xfun                   0.51     2025-02-19 [1] CRAN (R 4.4.2)
   XICOR                  0.4.1    2023-04-21 [1] CRAN (R 4.4.1)
   xml2                   1.3.6    2023-12-04 [1] CRAN (R 4.4.1)
   xtable                 1.8-4    2019-04-21 [1] CRAN (R 4.4.1)
   XVector                0.46.0   2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)
   yaml                   2.3.10   2024-07-26 [1] CRAN (R 4.4.1)
   zlibbioc               1.52.0   2024-10-29 [1] Bioconductor 3.20 (R 4.4.2)

 [1] /users/mirimia/narecco/software/miniconda/envs/analysis/lib/R/library

 * ── Packages attached to the search path.
 R ── Package was removed from disk.

──────────────────────────────────────────────────────────────────────────────

9.1 Quarto

This document is generated using Quarto which enables to weave together content and executable code into a finished document. This is an improved version of what used to be called ‘Rmarkdown’. The document hides all code chunks, but they can be opened up with the drop down arrow. On the top right corner there’s a toggle for dark-mode. To learn more about Quarto see here.

References

Han, Hong, Ulrich Braunschweig, Thomas Gonatopoulos-Pournatzis, Robert J. Weatheritt, Calley L. Hirsch, Kevin C H Ha, Ernest Radovani, et al. 2017. Multilayered Control of Alternative Splicing Regulatory Networks by Transcription Factors. Molecular Cell 65 (3): 539–553.e7. https://doi.org/10.1016/j.molcel.2017.01.011.
Tapial, Javier, Kevin C. H. Ha, Timothy Sterne-Weiler, André Gohr, Ulrich Braunschweig, Antonio Hermoso-Pulido, Mathieu Quesnel-Vallières, et al. 2017. An atlas of alternative splicing profiles and functional associations reveals new regulatory programs and genes that simultaneously express multiple major isoforms.” Genome Research 27 (10): 1759–68. https://doi.org/10.1101/gr.220962.117.
Weatheritt, Robert J, Timothy Sterne-Weiler, and Benjamin J Blencowe. 2016. The ribosome-engaged landscape of alternative splicing.” Nature Structural & Molecular Biology 23 (12): 1117–23. https://doi.org/10.1038/nsmb.3317.